On 4/28/23 9:28 PM, Grisha Levit wrote:


On Fri, Apr 28, 2023, 11:35 Chet Ramey <chet.ra...@case.edu> wrote:

    On 4/24/23 1:40 AM, Grisha Levit wrote:
     > The history expansion code can end up reading past the end of the
     > input line buffer if the line ends with an invalid multibyte sequence:

    Thanks for the report. You mean an incomplete multibyte character, I think.


Well I'm not quite sure. The (piped) input needs to have an invalid sequence (two leading bytes), but readline transforms this invalid sequence into just a single leading byte.

I'm just looking at the code. The only place that increments i in that loop
is the case where _rl_get_char_len returns -2. That reflects the return
value of mbrlen, which returns -2 to indicate a valid (so far) but
incomplete multibyte sequence. If the sequence were invalid, it would
return -1, and the loop would break.
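
To make those return values concrete, here is a minimal sketch, not code taken from bash or readline. The helper names classify and use_utf8_locale are made up for illustration, and the locale names are assumptions about what the system provides. It only shows how mbrlen distinguishes an incomplete sequence (-2) from an invalid one (-1):

```c
#include <locale.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: try to select a UTF-8 LC_CTYPE locale.
   The locale names are assumptions; not every system has them.
   Returns 1 on success, 0 if neither name is available. */
int
use_utf8_locale (void)
{
  return setlocale (LC_CTYPE, "C.UTF-8") != NULL
	 || setlocale (LC_CTYPE, "en_US.UTF-8") != NULL;
}

/* Hypothetical helper: report what mbrlen says about the first
   multibyte character in the N bytes at S.
     > 0  length in bytes of a complete character
       0  the character is NUL
      -1  invalid sequence
      -2  valid so far, but incomplete */
long
classify (const char *s, size_t n)
{
  mbstate_t ps;
  size_t r;

  memset (&ps, 0, sizeof (ps));	/* start in the initial shift state */
  r = mbrlen (s, n, &ps);
  if (r == (size_t)-1)
    return -1;
  if (r == (size_t)-2)
    return -2;
  return (long)r;
}
```

After a successful use_utf8_locale(), classify ("\xe0", 1) should give -2, since a lone leading byte is a valid prefix, while classify ("\xe0\xe0", 2) and classify ("\xa0", 1) should give -1, since a second leading byte or a stray continuation byte cannot start or extend a valid character.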

Piping input that simply ends in a leading byte doesn't trigger the issue -- that byte doesn't seem to make it into the input line.

This is a bit off topic, but I don't really understand what happens with invalid sequences in the input, see e.g.:

They should be treated as individual bytes.


$ bash --norc -i 2>/dev/null <<<$'printf %q\\\\n \240\340'
$'\240'
$ bash --norc -i 2>/dev/null <<<$'printf %q\\\\n \240\340.'
$'\240.'
$ bash --norc -i 2>/dev/null <<<$'printf %q\\\\n \240\340.\341'
$'\240.\340'

I can't reproduce that with a simplified case, so maybe it's readline:

$ printf '%q\n' $'\240\340'
$'\240\340'
$ printf '%q\n' $'\240\340.'
$'\240\340.'
$ printf '%q\n' $'\240\340.\341'
$'\240\340.\341'
$ echo $BASH_VERSION
5.2.15(6)-maint


--
``The lyf so short, the craft so long to lerne.'' - Chaucer
                 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/

