On 4/28/23 9:28 PM, Grisha Levit wrote:
On Fri, Apr 28, 2023, 11:35 Chet Ramey <chet.ra...@case.edu> wrote:
On 4/24/23 1:40 AM, Grisha Levit wrote:
> The history expansion code can end up reading past the end of the
> input line buffer if the line ends with an invalid multibyte sequence:
Thanks for the report. You mean an incomplete multibyte character, I think.
Well, I'm not quite sure. The (piped) input needs to have an invalid
sequence (two leading bytes) but readline transforms this invalid sequence
into just a single leading byte.
I'm just looking at the code. The only place that increments i in that loop
is the case where _rl_get_char_len returns -2. That reflects the return
value of mbrlen, which returns -2 to indicate a valid (so far) but
incomplete multibyte sequence. If the sequence were invalid, it would
return -1, and the loop would break.
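The difference between the two return values can be modeled in a few lines
of shell. This is only a rough sketch of the mbrlen convention as it applies
to UTF-8, not readline's code: the function name utf8_status is made up for
illustration, bytes are passed as hex arguments, and overlong or surrogate
encodings are not rejected the way a real mbrlen would reject them.

```shell
# Rough model of the mbrlen() return convention for UTF-8 (assumption:
# UTF-8 locale; simplified, NUL and overlong forms are not special-cased).
# Arguments: the bytes of the candidate sequence as hex, e.g. e2 82 ac.
# Prints the length if the first character is complete, -2 if the bytes
# are a valid but incomplete prefix, and -1 if they are invalid.
utf8_status() {
    if [ "$#" -eq 0 ]; then printf '%s\n' -2; return; fi
    lead=$((0x$1)); shift
    if [ "$lead" -lt 128 ]; then
        printf '%s\n' 1; return                 # single-byte (ASCII)
    elif [ "$lead" -ge 194 ] && [ "$lead" -le 223 ]; then
        need=1                                  # C2..DF: 2-byte sequence
    elif [ "$lead" -ge 224 ] && [ "$lead" -le 239 ]; then
        need=2                                  # E0..EF: 3-byte sequence
    elif [ "$lead" -ge 240 ] && [ "$lead" -le 244 ]; then
        need=3                                  # F0..F4: 4-byte sequence
    else
        printf '%s\n' -1; return                # stray continuation or bad lead
    fi
    total=$((need + 1))
    while [ "$need" -gt 0 ]; do
        # Ran out of bytes before the character was complete.
        if [ "$#" -eq 0 ]; then printf '%s\n' -2; return; fi
        b=$((0x$1)); shift
        # Continuation bytes must be in 80..BF.
        if [ "$b" -lt 128 ] || [ "$b" -gt 191 ]; then
            printf '%s\n' -1; return
        fi
        need=$((need - 1))
    done
    printf '%s\n' "$total"
}

utf8_status e0       # lone leading byte at end of line: prints -2
utf8_status e0 e1    # leading byte followed by another leading byte: prints -1
utf8_status a0       # stray continuation byte: prints -1
```

Under this model, only the lone trailing \340 falls into the -2 case; two
leading bytes in a row, or a stray \240, are reported as invalid.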
Piping input that simply ends in a leading byte doesn't trigger the issue
-- that byte doesn't seem to make it into the input line.
This is a bit off topic, but I don't really understand what happens with
invalid sequences in the input, see e.g.:
They should be treated as individual bytes.
$ bash --norc -i 2>/dev/null <<<$'printf %q\\\\n \240\340'
$'\240'
$ bash --norc -i 2>/dev/null <<<$'printf %q\\\\n \240\340.'
$'\240.'
$ bash --norc -i 2>/dev/null <<<$'printf %q\\\\n \240\340.\341'
$'\240.\340'
I can't reproduce that with a simplified case, so maybe it's readline:
$ printf '%q\n' $'\240\340'
$'\240\340'
$ printf '%q\n' $'\240\340.'
$'\240\340.'
$ printf '%q\n' $'\240\340.\341'
$'\240\340.\341'
$ echo $BASH_VERSION
5.2.15(6)-maint
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU c...@case.edu http://tiswww.cwru.edu/~chet/