bash: read: how to capture '\n' (newline) character?

Can I use read to capture the newline character (\n, \012)?

Define test function:

f() { read -rd '' -n1 -p "Enter a character: " char &&
      printf "\nYou entered: %q\n" "$char"; }

Run the function, press Enter:

$ f;
Enter a character: 

You entered: ''

Hmmm. It’s a null string.

How do I get my expected output:

$ f;
Enter a character:

You entered: $'\012'
$ 

I want the same method to be able to capture ^D or \004.

If read can’t do it, what is the workaround?

Asked By: Tom Hale

||

bash builtin read can do this:

read -rd $'\0' -p ... -n 1
printf '\nYou typed ASCII %02x\n' "'$REPLY"

(this piece of code won’t work with multi-byte characters)

Notice how I didn’t put what is read in variable char as you did. This is because doing so would remove the characters in IFS from char. With a standard IFS, you wouldn’t be able to differentiate between space, tab and newline.
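The difference between a named variable and REPLY can be checked without a terminal by piping a newline into read. This is only a minimal sketch: read_named and read_reply are hypothetical helper names, not part of the answer above, and it assumes bash with its default IFS.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for comparison; the names are illustrative only.

# Reads one character into a named variable, so IFS whitespace is stripped.
read_named() {
    local char
    read -rd '' -n1 char
    printf '%q\n' "$char"
}

# Reads one character with no variable name, so it lands in REPLY unmodified.
read_reply() {
    read -rd '' -n1
    printf '%q\n' "$REPLY"
}

printf '\n' | read_named    # prints ''
printf '\n' | read_reply    # prints $'\n'

# The "'$REPLY" trick from above gives the byte value in hex.
printf '\n' | { read -rd '' -n1; printf '%02x\n' "'$REPLY"; }    # prints 0a
```

The same pipes can be fed a space or a tab to see that a named variable loses them as well, while REPLY does not.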

Answered By: xhienne

To read 1 character, use -N instead, which reads one character always and doesn’t do $IFS processing:

read -rN1 var

read -rn1 reads one record of up to one character and still does $IFS processing (newline is in the default value of $IFS, which explains why you get an empty result even though you read NUL-delimited records). You’d use it instead to limit the length of the record you read.
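As a rough illustration of the -N behaviour described here, the sketch below wraps read -rN1 in a function and feeds it piped input. read_char is a hypothetical name chosen for this example, and only piped input is shown; terminal behaviour is not covered.

```bash
#!/usr/bin/env bash
# read_char is a hypothetical helper name, not a bash builtin.
# It reads exactly one character from stdin with -N, which skips IFS
# processing and does not treat newline as a delimiter.
read_char() {
    local c
    read -rN1 c || return 1
    printf '%q\n' "$c"
}

printf '\n'   | read_char    # prints $'\n'
printf '\004' | read_char    # prints $'\004'

# For contrast: with -n1 the newline is taken as the record delimiter,
# so the variable ends up empty.
printf '\n' | { read -rn1 c; printf '%q\n' "$c"; }    # prints ''
```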

In your case, with NUL-delimited records (with -d ''), IFS= read -d '' -rn1 var would work the same, as bash cannot store a NUL character in its variables anyway, so printf '\0' | read -rN1 var would leave $var empty and return a non-zero exit status.
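The NUL limitation can be observed with a small experiment. This is a sketch only: nul_length is a hypothetical helper name, and the exit status that read reports may depend on the bash version, so the code prints it rather than assuming a value.

```bash
#!/usr/bin/env bash
# nul_length is a hypothetical helper: it tries to read one character
# from stdin with -N1 and prints the length of what ended up in the variable.
nul_length() {
    local var
    read -rN1 var
    printf '%s\n' "${#var}"
}

# A lone NUL byte cannot be stored, so the variable stays empty.
printf '\0' | nul_length    # prints 0

# Show the exit status of read for the same input (value not assumed here).
printf '\0' | { read -rN1 var; printf 'read status: %s\n' "$?"; }
```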

To be able to read arbitrary characters including NUL, you’d use the zsh shell instead where the syntax is:

read -k1 'var?Enter a character: '

(no need for -r or IFS= there. However note that read -k reads from the terminal (k is for key; zsh’s -k option predates bash’s and even ksh93’s -N by decades). To read from stdin, use read -u0 -k1).

Example (here pressing Ctrl+Space to enter a NUL character):

$ read -k1 'var?Enter a character: '
Enter a character: ^@
$ printf '%q\n' $var
$'\0'

Note that to be able to read a character, read may have to read more than one byte. If the input starts with the first byte of a multi-byte character, it will read at least one more byte, so if the input contains byte sequences that don’t form valid characters, you could end up with $var containing something that the shell considers to have a length greater than 1.

For instance in a UTF-8 locale:

$ printf '\xfc\x80\x80\x80\x80XYZ' | bash -c 'read -rN1 a; printf "<%q>\n" "$a" "${#a}"; wc -c'
<$'\374\200\200\200\200X'>
<6>
2
$ printf '\xfc\x80\x80\x80\x80XYZ' | zsh -c 'read -u0 -k1 a; printf "<%q>\n" $a $#a; wc -c'
<$'\374'$'\200'$'\200'$'\200'$'\200'X>
<6>
2

In UTF-8, 0xFC is the first byte of a 6-byte long character; the 5 other bytes are meant to have the 8th bit set and the 7th bit unset, but we provide only 4. read still reads the extra X while trying to find the end of the character, so X ends up in $var along with the 5 bytes that don’t form a valid character, and each of those 6 bytes is counted as one character.

Answered By: Stéphane Chazelas