On 2/25/19 5:42 PM, Olga Ustuzhanina wrote:
> On Mon, 25 Feb 2019 12:59:38 -0800
> L A Walsh <b...@tlinx.org> wrote:
> 
>> In this case, the decode of \xc2 doesn't swallow the following
>> character.
> 
> I want to clarify that \xc2 (and other characters in the range
> mentioned above) can only swallow a \0. Other characters are
> unaffected.

The other characters wouldn't be treated as a delimiter either. The \0
is `swallowed' because it's the C string terminator.

The \0 gets added to the input string, but it's not treated as a delimiter,
since it's part of the invalid multibyte sequence. Then the next character
is read; that \0 is treated as a delimiter, and the input string, including
the earlier \0, is assigned to the variable. The embedded \0 then acts as a
normal C string terminator, since variable values can't contain NULs.

(This is why read discards \0 unless it's a delimiter. It would terminate
the value assigned to the variable.)
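
A small sketch of the two ordinary cases, assuming bash and plain ASCII
input so that no multibyte decoding is involved. The variable names `line`
and `field` are only for illustration, and this does not reproduce the
\xc2 case discussed above, which depends on the locale and bash version.

```bash
#!/usr/bin/env bash
# Sketch: how read treats NUL bytes in plain ASCII input.

# Default delimiter (newline): the embedded \0 is discarded rather than
# stored, so it cannot terminate the value early.
IFS= read -r line < <(printf 'a\0b\n')
printf 'newline delimiter: %s (length %d)\n' "$line" "${#line}"

# -d '' makes \0 the delimiter, so reading stops at the first NUL.
IFS= read -r -d '' field < <(printf 'a\0b')
printf 'NUL delimiter: %s (length %d)\n' "$field" "${#field}"
```

With the newline delimiter the value should come out as the two
characters on either side of the NUL; with -d '' only the part before the
NUL should be assigned.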

Bash-4.4 returned different results because it didn't attempt to validate
multibyte characters as it read them unless it was reading a fixed number
of characters.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
                 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
