When the read builtin is invoked with -n/-N <nchars>, the documentation specifies that at most <nchars> characters will be read from stdin. This is not true when stdin contains null characters: read discards each null character and keeps reading without incrementing its counter, continuing until it has consumed <nchars> non-null characters. As a result, read never returns when the input stream supplies null characters indefinitely and fewer than <nchars> non-null characters: for example, `read -N 1 </dev/zero` never terminates.
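To make the difference concrete, here is a minimal standalone model of the counting loop. It is only a sketch under stated assumptions: `model_read` and its `count_nul` flag are hypothetical names invented for illustration, the input is an in-memory buffer rather than a file descriptor, and delimiter, backslash and multibyte handling are left out entirely. It is not the code in builtins/read.def.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model of the -n/-N counting loop (illustration only).
 *
 * Copies bytes from in[0..inlen) into out, discarding NUL bytes, and stops
 * once the counter nr reaches nchars or the input is exhausted.
 *
 *   count_nul == 0: a discarded NUL does not advance nr (old behavior)
 *   count_nul != 0: a discarded NUL still advances nr   (patched behavior)
 *
 * Returns the number of input bytes consumed. out is always NUL-terminated.
 */
static size_t
model_read (const char *in, size_t inlen, size_t nchars, int count_nul,
            char *out, size_t outsize)
{
  size_t consumed = 0;  /* bytes taken from the input */
  size_t nr = 0;        /* counter compared against nchars */
  size_t stored = 0;    /* bytes kept in out */

  assert (out != NULL && outsize > 0);

  while (consumed < inlen && nr < nchars)
    {
      char c = in[consumed++];

      if (c == '\0')
        {
          /* The NUL is dropped either way; only the counting differs. */
          if (count_nul)
            nr++;
          continue;
        }

      if (stored + 1 < outsize)
        out[stored++] = c;
      nr++;
    }

  out[stored] = '\0';
  return consumed;
}
```

For the six input bytes `a \0 c d e \n` with a limit of 3, the model consumes four bytes and stores "acd" when `count_nul` is 0, and consumes three bytes and stores "ac" when `count_nul` is nonzero. The latter matches the "ac" line this patch adds to tests/read.right. With an all-NUL input and a limit of 1, the `count_nul == 0` case only stops when the buffer runs out, which is the finite analogue of the /dev/zero hang.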
This patch aligns the read builtin's actual behavior with its documented behavior by teaching the read builtin to count null characters toward the <nchars> maximum. It also updates the documentation to describe explicitly how the read builtin behaves when it encounters null characters - both the fact that it discards them, and that the null characters (as of this patch) will count toward the -n/-N maximum.
---

Notes:
    I earlier tried to submit a bug report using bashbug but I am not sure if it came through. Sending as a patch now. Sorry for the duplicate if it is that. If I included a patch in the bug report, this one is the same except that it also updates documentation, so I suggest using this instead.

 builtins/read.def | 3 ++-
 doc/bash.1        | 3 +++
 tests/read.right  | 1 +
 tests/read3.sub   | 6 ++++++
 4 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/builtins/read.def b/builtins/read.def
index b57c8c398e18..5713766e4c3a 100644
--- a/builtins/read.def
+++ b/builtins/read.def
@@ -672,7 +672,7 @@ read_builtin (list)
         break;
 
       if (c == '\0' && delim != '\0')
-        continue;       /* skip NUL bytes in input */
+        goto increment_byte_count;      /* skip NUL bytes in input */
 
       if ((skip_ctlesc == 0 && c == CTLESC) || (skip_ctlnul == 0 && c == CTLNUL))
         {
@@ -713,6 +713,7 @@ add_char:
         }
 #endif
 
+increment_byte_count:
       nr++;
 
       if (nchars > 0 && nr >= nchars)
diff --git a/doc/bash.1 b/doc/bash.1
index e6cd08db3867..f1a72e1ae4cf 100644
--- a/doc/bash.1
+++ b/doc/bash.1
@@ -9138,6 +9138,7 @@ are used to split the line into words using the same rules the shell
 uses for expansion (described above under \fBWord Splitting\fP).
 The backslash character (\fB\e\fP) may be used to remove any special
 meaning for the next character read and for line continuation.
+Null characters are discarded.
 Options, if supplied, have the following meanings:
 .RS
 .PD 0
@@ -9178,6 +9179,7 @@ buffer before editing begins.
 \fBread\fP returns after reading \fInchars\fP characters rather than
 waiting for a complete line of input, but honors a delimiter if fewer
 than \fInchars\fP characters are read before the delimiter.
+Null characters count toward this limit.
 .TP
 .B \-N \fInchars\fP
 \fBread\fP returns after reading exactly \fInchars\fP characters rather
@@ -9189,6 +9191,7 @@ not treated specially and do not cause \fBread\fP to return until
 The result is not split on the characters in \fBIFS\fP; the intent is
 that the variable is assigned exactly the characters read
 (with the exception of backslash; see the \fB\-r\fP option below).
+Null characters count toward this limit.
 .TP
 .B \-p \fIprompt\fP
 Display \fIprompt\fP on standard error, without a
diff --git a/tests/read.right b/tests/read.right
index 73cb7042fbca..839c532f7c67 100644
--- a/tests/read.right
+++ b/tests/read.right
@@ -45,6 +45,7 @@ abcde
 abc
 ab
 abc
+ac
 #
 while read -u 3 var
 do
diff --git a/tests/read3.sub b/tests/read3.sub
index af41e3f27930..02334fdad6d2 100644
--- a/tests/read3.sub
+++ b/tests/read3.sub
@@ -20,5 +20,11 @@ echo abc | {
         echo $foo
 }
 
+# does not consume too many characters
+echo abcde | tr b '\0' | {
+        read -N 3 foo
+        echo $foo
+}
+
 read -n 1 < $0
 echo "$REPLY"
-- 
2.20.1