Re: Corrupted multibyte characters in command substitutions fixes may be worse than problem.

2022-02-07 Thread Alex fxmbsw7 Ratchev
On Tue, Feb 8, 2022 at 12:09 AM Ángel wrote: > > On 2022-02-07 at 11:55 +0100, Alex fxmbsw7 Ratchev wrote: > > > > however my solution still stays > > > > you just use memory locations instead of c strings > > > > and those entries in memory are of course of known length, before > > > > setting an

Re: Corrupted multibyte characters in command substitutions fixes may be worse than problem.

2022-02-07 Thread Ángel
On 2022-02-07 at 11:55 +0100, Alex fxmbsw7 Ratchev wrote: > > > however my solution still stays > > > you just use memory locations instead of c strings > > > and those entries in memory are of course of known length, before > > > setting and all is fine > > > > "Your" solution is decades old. Ev

Re: Corrupted multibyte characters in command substitutions fixes may be worse than problem.

2022-02-07 Thread Frank Heckenbach
> In the case of bash with environment having LC_CTYPE: C.UTF-8 or > en_US.UTF-8 > read: > 0xC3 (len=1) i.e. Ã ('A' w/tilde in a legacy 8-bit latin-compatible > charset), > but invalid if bash processes the environment setting of en_US.UTF-8. > > Should bash process it as legacy input or invali

Re: Corrupted multibyte characters in command substitutions fixes may be worse than problem.

2022-02-07 Thread Alex fxmbsw7 Ratchev
On Mon, Feb 7, 2022 at 7:45 AM Lawrence Velázquez wrote: > > On Mon, Feb 7, 2022, at 1:26 AM, Alex fxmbsw7 Ratchev wrote: > > well i saw now, printf a char of "\0" results in 0 bytes out to wc -c > > % /usr/bin/printf '\0' | wc -c >1 > > > > however my solution still stays > > you just use

Re: Corrupted multibyte characters in command substitutions fixes may be worse than problem.

2022-02-06 Thread Lawrence Velázquez
On Mon, Feb 7, 2022, at 1:26 AM, Alex fxmbsw7 Ratchev wrote: > well i saw now, printf a char of "\0" results in 0 bytes out to wc -c % /usr/bin/printf '\0' | wc -c 1 > however my solution still stays > you just use memory locations instead of c strings > and those entries in memory are of

Re: Corrupted multibyte characters in command substitutions fixes may be worse than problem.

2022-02-06 Thread Alex fxmbsw7 Ratchev
On Mon, Feb 7, 2022 at 6:19 AM Lawrence Velázquez wrote: > > On Sun, Feb 6, 2022, at 11:53 PM, Alex fxmbsw7 Ratchev wrote: > > On Mon, Feb 7, 2022 at 12:02 AM Greg Wooledge wrote: > >> There are other programming languages besides bash. Some of them can > >> store NUL bytes internally, either by

Re: Corrupted multibyte characters in command substitutions fixes may be worse than problem.

2022-02-06 Thread Lawrence Velázquez
On Sun, Feb 6, 2022, at 11:53 PM, Alex fxmbsw7 Ratchev wrote: > On Mon, Feb 7, 2022 at 12:02 AM Greg Wooledge wrote: >> There are other programming languages besides bash. Some of them can >> store NUL bytes internally, either by encoding and decoding them on the >> fly, or by not using C-style s

Re: Corrupted multibyte characters in command substitutions fixes may be worse than problem.

2022-02-06 Thread Alex fxmbsw7 Ratchev
On Mon, Feb 7, 2022 at 3:37 AM Chet Ramey wrote: > > On 2/6/22 5:11 PM, Alex fxmbsw7 Ratchev wrote: > > i just have a small question here > > the dropping of null bytes is no friend of me and i understand you're > > there to skip it instead of process, which results in null bytes gone > > which is

Re: Corrupted multibyte characters in command substitutions fixes may be worse than problem.

2022-02-06 Thread Alex fxmbsw7 Ratchev
On Mon, Feb 7, 2022 at 1:47 AM L A Walsh wrote: > > > > > On 2022/02/06 09:26, Frank Heckenbach wrote: > >> On 2022/01/02 17:43, Frank Heckenbach wrote: > >> > >> > >>> Why would you? Aren't you able to assess the severity of a bug > >>> yourself? Silent data corruption is certainly one of the mos

Re: Corrupted multibyte characters in command substitutions fixes may be worse than problem.

2022-02-06 Thread Alex fxmbsw7 Ratchev
On Mon, Feb 7, 2022 at 12:02 AM Greg Wooledge wrote: > > On Sun, Feb 06, 2022 at 11:11:43PM +0100, Alex fxmbsw7 Ratchev wrote: > > [[ Regarding nul bytes discarded by command substitution ]] > > can't these \0 bytes be encoded at least when a utf8 locale is used as > > \u0 instead of dropping ? a

Re: Corrupted multibyte characters in command substitutions fixes may be worse than problem.

2022-02-06 Thread Chet Ramey
On 2/6/22 5:11 PM, Alex fxmbsw7 Ratchev wrote: i just have a small question here the dropping of null bytes is no friend of me and i understand you're there to skip it instead of process, which results in null bytes gone which is not much of an use can't these \0 bytes be encoded at least when a

Re: Corrupted multibyte characters in command substitutions fixes may be worse than problem.

2022-02-06 Thread L A Walsh
On 2022/02/06 09:26, Frank Heckenbach wrote: On 2022/01/02 17:43, Frank Heckenbach wrote: Why would you? Aren't you able to assess the severity of a bug yourself? Silent data corruption is certainly one of the most severe kind of bugs ... --- That's debatable, BTW, as I was rem

Re: Corrupted multibyte characters in command substitutions fixes may be worse than problem.

2022-02-06 Thread Robert Elz
Date: Sun, 6 Feb 2022 18:01:03 -0500 From: Greg Wooledge Message-ID: | I urge you to learn one of these other languages, and use it. | | Bash is a shell, not a full general-purpose programming language. It's | not suited to all tasks. Many other languages are

Re: Corrupted multibyte characters in command substitutions fixes may be worse than problem.

2022-02-06 Thread Greg Wooledge
On Sun, Feb 06, 2022 at 11:11:43PM +0100, Alex fxmbsw7 Ratchev wrote: > [[ Regarding nul bytes discarded by command substitution ]] > can't these \0 bytes be encoded at least when a utf8 locale is used as > \u0 instead of dropping ? and a null, ... just > prefix the utf 8 encoding chars to the nul
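
[Editorial illustration, not part of the original message.] The behaviour under discussion, NUL bytes being discarded by command substitution, can be observed with a short sketch like the one below. The variable names are made up for illustration; recent bash versions also print a warning about the ignored null byte on stderr, which does not affect the result.

```shell
# A NUL byte survives a pipe: three bytes arrive at wc.
piped_bytes=$(printf 'a\0b' | wc -c)
piped_bytes=$((piped_bytes))      # normalise any padding wc adds

# The same output captured by command substitution loses the NUL,
# because shell variables are stored as C strings.
captured=$(printf 'a\0b')
captured_len=${#captured}

echo "bytes through the pipe: $piped_bytes"
echo "characters captured:    $captured_len"
```

The pipe carries the data unchanged; only the step that stores the output in a shell variable drops the NUL.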

Re: Corrupted multibyte characters in command substitutions fixes may be worse than problem.

2022-02-06 Thread Alex fxmbsw7 Ratchev
a replacement sequence to null bytes i would find a solution to null bytes no i didnt understand these posts of these emails but i am just concerned about the null bytes not being dropped On Sun, Feb 6, 2022 at 11:16 PM Alex fxmbsw7 Ratchev wrote: > > im sorry i didnt realize it would just prefix

Re: Corrupted multibyte characters in command substitutions fixes may be worse than problem.

2022-02-06 Thread Alex fxmbsw7 Ratchev
im sorry i didnt realize it would just prefix to null byte, which uses nullbyte, so it wont work cheers On Sun, Feb 6, 2022 at 11:11 PM Alex fxmbsw7 Ratchev wrote: > > i just have a small question here > the dropping of null bytes is no friend of me and i understand you're > there to skip it inst

Re: Corrupted multibyte characters in command substitutions fixes may be worse than problem.

2022-02-06 Thread Alex fxmbsw7 Ratchev
i just have a small question here the dropping of null bytes is no friend of me and i understand you're there to skip it instead of process, which results in null bytes gone which is not much of an use can't these \0 bytes be encoded at least when a utf8 locale is used as \u0 instead of dropping ?

Re: Corrupted multibyte characters in command substitutions fixes may be worse than problem.

2022-02-06 Thread Chet Ramey
On 2/5/22 9:41 PM, L A Walsh wrote: That's debatable, BTW, as I was reminded of a similar passthrough of what one might call 'invalid input' w/o warning, resulting in code that worked in a specific circumstance to a change in bash issuing a warning that resulted in breaking code, that, at that p

Re: Corrupted multibyte characters in command substitutions fixes may be worse than problem.

2022-02-06 Thread Frank Heckenbach
> On 2022/01/02 17:43, Frank Heckenbach wrote: > > > Why would you? Aren't you able to assess the severity of a bug > > yourself? Silent data corruption is certainly one of the most severe > > kind of bugs ... > --- > That's debatable, BTW, as I was reminded of a similar > passthrough of what one m

Re: Corrupted multibyte characters in command substitutions fixes may be worse than problem.

2022-02-05 Thread L A Walsh
On 2022/01/02 17:43, Frank Heckenbach wrote: Chet Ramey wrote: After all, we're talking about silent data corruption, and now I learn the bug is known for almost a year, the fix is known and still hasn't been released, not even as an official patch. If you use the number of bug repor

Re: Corrupted multibyte characters in command substitutions

2022-01-08 Thread Chet Ramey
On 1/7/22 8:00 PM, Ángel wrote: I haven't make my mind^W^W^W^W have looked at the other patches and this one (patch 14) seems the most critical. Patchset 9 as well if the crash happens in the systemd version shipped in Debian stable. All the patches fix problems that have resulted in incorrect

Re: Corrupted multibyte characters in command substitutions

2022-01-08 Thread Chet Ramey
On 1/7/22 7:21 PM, Ángel wrote: When you mentioned the environment in the previous mail I thought in the environment block (which you reset with env -i). As for the environment in general, yes, apparently there are more things that cause it to be even more random. The buffer is an auto variabl

Re: Corrupted multibyte characters in command substitutions

2022-01-07 Thread Ángel
On 2022-01-06 at 17:03 -0500, Chet Ramey wrote: > On 1/3/22 6:02 PM, Ángel wrote: > > > Or, an even simpler one (assuming a utf-8 locale, like almost > > everyone uses these days): > > $ printf "%511s\xc3\xa4" | ./bash -c 'a="$(echo a)"; d=$(cat); echo > > "$d"' | sed 's/^ *//' > > Ö� > > > > whe

Re: Corrupted multibyte characters in command substitutions

2022-01-07 Thread Ángel
On 2022-01-08 at 00:22 +0100, Frank Heckenbach wrote: > Ángel wrote: > > > I think that had you tested the devel branch instead of the last > > release, you could have skipped a lot of testing (but how would you > > have known? it's an easy thing to miss). > > https://savannah.gnu.org/patch/?10035

Re: Corrupted multibyte characters in command substitutions

2022-01-07 Thread Frank Heckenbach
Ángel wrote: > I think that had you tested the devel branch instead of the last > release, you could have skipped a lot of testing (but how would you > have known? it's an easy thing to miss). > https://savannah.gnu.org/patch/?10035 seems to have gone the "easy > fix", which you discarded to get a

Re: Corrupted multibyte characters in command substitutions

2022-01-06 Thread Chet Ramey
On 1/3/22 6:02 PM, Ángel wrote: Or, an even simpler one (assuming a utf-8 locale, like almost everyone uses these days): $ printf "%511s\xc3\xa4" | ./bash -c 'a="$(echo a)"; d=$(cat); echo "$d"' | sed 's/^ *//' Ö� where it should have output: ä Even with this reproducer, I was unable to get
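
[Editorial illustration, not part of the original message.] The reproducer pads the input with 511 spaces before the two-byte UTF-8 encoding of ä (0xC3 0xA4). A minimal sketch of why that padding matters, assuming the output is read in 512-byte chunks (the padding suggests this, but the snippet above does not state it): the lead byte becomes the last byte of one chunk and the continuation byte the first byte of the next. The file and variable names are hypothetical, and octal escapes are used instead of \x for portability.

```shell
# Recreate the reproducer's input in a temporary file.
tmp=$(mktemp)
printf '%511s\303\244' '' > "$tmp"

# Total size: 511 spaces plus the two bytes of the character.
total=$(( $(wc -c < "$tmp") ))

# Last byte of the first 512-byte chunk (assumed chunk size).
chunk_end=$(head -c 512 "$tmp" | tail -c 1 | od -An -tx1 | tr -d ' \n')

# First byte after that chunk, which is also the last byte of the file.
next_start=$(tail -c 1 "$tmp" | od -An -tx1 | tr -d ' \n')

echo "total bytes:          $total"
echo "byte 512 (chunk end): $chunk_end"
echo "byte 513 (next read): $next_start"

rm -f "$tmp"
```

Under that assumption, any code that decodes each chunk independently sees an incomplete sequence at the end of the first chunk and a stray continuation byte at the start of the second.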

Re: Corrupted multibyte characters in command substitutions

2022-01-03 Thread Ángel
Hello Frank I think that had you tested the devel branch instead of the last release, you could have skipped a lot of testing (but how would you have known? it's an easy thing to miss). https://savannah.gnu.org/patch/?10035 seems to have gone the "easy fix", which you discarded to get a more thoro

Re: Corrupted multibyte characters in command substitutions

2022-01-02 Thread Frank Heckenbach
Chet Ramey wrote: > > After all, we're talking about silent data corruption, and now I > > learn the bug is known for almost a year, the fix is known and still > > hasn't been released, not even as an official patch. > > If you use the number of bug reports as an indication of urgency, Why would

Re: Corrupted multibyte characters in command substitutions

2022-01-02 Thread Chet Ramey
On 1/2/22 1:38 PM, Frank Heckenbach wrote: Chet Ramey wrote: On 1/1/22 7:02 PM, Frank Heckenbach wrote: Thanks for the report. This is a pretty good in-depth analysis of the issue. This was fixed back in March, 2021 in the devel branch as a result of https://savannah.gnu.org/patch/?10035 (tho

Re: Corrupted multibyte characters in command substitutions

2022-01-02 Thread Frank Heckenbach
Chet Ramey wrote: > On 1/1/22 7:02 PM, Frank Heckenbach wrote: > > Thanks for the report. This is a pretty good in-depth analysis of the issue. > > This was fixed back in March, 2021 in the devel branch as a result of > https://savannah.gnu.org/patch/?10035 (though the fix is different from > you

Re: Corrupted multibyte characters in command substitutions

2022-01-02 Thread Chet Ramey
On 1/1/22 7:02 PM, Frank Heckenbach wrote: Configuration Information [Automatically generated, do not change]: Machine: x86_64 OS: linux-gnu Compiler: gcc Compilation CFLAGS: -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wall uname output: Linux mars 5.10.0-9-amd64 #1 SMP Deb

Corrupted multibyte characters in command substitutions

2022-01-01 Thread Frank Heckenbach
Configuration Information [Automatically generated, do not change]: Machine: x86_64 OS: linux-gnu Compiler: gcc Compilation CFLAGS: -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wall uname output: Linux mars 5.10.0-9-amd64 #1 SMP Debian 5.10.70-1 (2021-09-30) x86_64 GNU/Linux