A replacement sequence for null bytes: with that I would have a solution to the null-byte problem. No, I didn't understand all of the posts in these emails; I am just concerned that the null bytes not be dropped.
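
A minimal sketch of the replacement-sequence idea done outside the shell's own string handling, assuming POSIX `od`, `tr`, and `printf` plus the common `mktemp` are available. The bytes are hex-encoded before they reach a shell variable, so a null byte is carried as `00` rather than dropped. The function names `to_hex` and `from_hex` are hypothetical helpers, not bash builtins.

```shell
# Hypothetical helpers for illustration; not part of bash.

# to_hex FILE: print every byte of FILE as two hex digits, e.g. "610062".
to_hex() {
    od -An -v -tx1 "$1" | tr -d '[:space:]'
}

# from_hex HEXSTRING: write the decoded bytes, null bytes included, to stdout.
from_hex() {
    hex=$1
    while [ -n "$hex" ]; do
        rest=${hex#??}            # everything after the first two digits
        byte=${hex%"$rest"}       # the first two digits
        # Turn the hex pair into a three-digit octal escape for printf.
        printf "\\$(printf '%03o' "0x$byte")"
        hex=$rest
    done
}

# Demo: 'a', a null byte, 'b'.
demo=$(mktemp)
printf 'a\000b' > "$demo"
encoded=$(to_hex "$demo")         # safe to hold in a variable
echo "$encoded"                   # prints 610062
from_hex "$encoded" | od -An -v -tx1
rm -f "$demo"
```

The decoded stream has to go straight to a file or a pipe; capturing it with `$(...)` again would drop the null bytes once more.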
On Sun, Feb 6, 2022 at 11:16 PM Alex fxmbsw7 Ratchev <fxmb...@gmail.com> wrote:
>
> im sorry i didnt realize it would just prefix to null byte, which uses
> nullbyte, so it wont work
> cheers
>
> On Sun, Feb 6, 2022 at 11:11 PM Alex fxmbsw7 Ratchev <fxmb...@gmail.com>
> wrote:
> >
> > i just have a small question here
> > the dropping of null bytes is no friend of me and i understand you're
> > there to skip it instead of process, which results in null bytes gone
> > which is not much of an use
> >
> > can't these \0 bytes be encoded at least when a utf8 locale is used as
> > \u0 instead of dropping ? <the two utf 8 bytes> and a null, ... just
> > prefix the utf 8 encoding chars to the null
> > and they'd be safely maybe still here
> >
> > just asking..
> >
> > On Sun, Feb 6, 2022 at 6:38 PM Chet Ramey <chet.ra...@case.edu> wrote:
> > >
> > > On 2/5/22 9:41 PM, L A Walsh wrote:
> > >
> > > > That's debatable, BTW, as I was reminded of a similar
> > > > passthrough of what one might call 'invalid input' w/o warning,
> > > > resulting in code that worked in a specific circumstance to a change
> > > > in bash issuing a warning that resulted in breaking code, that, at that
> > > > point, worked as expected.
> > >
> > > Memory is a tricky thing. This statement -- you've made it twice -- got me
> > > wondering what you might be referring to, so I went digging.
> > >
> > >
> > > > Specifically, it involved reading a value typically in the range
> > > > 50 <=x <=150 from an active file (like a value from /proc that varies
> > > > based on OS internal values) where the data was stored in a
> > > > quad, or Little-Endian DWORD value, so the value was in the the
> > > > 2 least significant bytes with the most significant bytes following
> > > > (in a higher position) in memory, like:
> > > > Byte# => 00 01 02 03, for value 100 decimal:
> > > > hex   => 64 00 00 00
> > > >
> > > > The working code expected to see 0x64 followed by 0x00 which it
> > > > used as string terminator.
> > > >
> > > > Chet "fixed" this silent use of 0x00 as a string terminator to no longer
> > > > ignore it, but have bash issue a warning message, which caused the
> > > > "read < fn" to fail and return 0 instead of the ascii character 'd',
> > > > which
> > > > the program had interpret as the DPI value of the user's screen.
> > >
> > > So it seems like you've conflated two different things. The first is the
> > > command substitution warning about dropping NULL bytes from 2016:
> > >
> > > https://lists.gnu.org/archive/html/bug-bash/2016-09/msg00015.html
> > >
> > > which I talked about a couple of days ago:
> > >
> > > https://lists.gnu.org/archive/html/bug-bash/2022-02/msg00054.html
> > >
> > > The second is a change from back in 2011 (bash-4.2 days) that changed bash
> > > to drop NULL bytes in the read builtin:
> > >
> > > https://lists.gnu.org/archive/html/bug-bash/2011-11/msg00136.html
> > >
> > > One of my messages in that thread contains a quickie survey of other
> > > shells' behavior here. The change is in line with what other shells do.
> > >
> > >
> > > > It took some debugging and hack arounds to find another way to access
> > > > the data. So what some might have called silent data corruption because
> > > > bash silently passed through the nul terminated datum as a string
> > > > terminator, my program took as logical behavior. I complained about
> > > > the change,
> > >
> > > Where? Since this is the opposite of what happened in the command
> > > substitution case, I'm assuming you mean the read change from 2011. You
> > > didn't participate in the original discussion, and I'm just not inclined
> > > to go digging around the archives for it.
> > >
> > > > remarking that if bash was going to sanitize returned values
> > > > (in that case checking for what should have been an ascii value with NUL
> > > > not being in the allowed value of string characters), that bash might
> > > > also be saddled with checking for invalid Unicode sequences and warning
> > > > about
> > > > them as well, regardless of the source of the corruption, some programs
> > > > might expect to get a raw byte sequence rather than some encoded form
> > > > with the difference in interpretation causing noticeable bugs.
> > >
> > > You might actually have said something like this at some point.
> > >
> > > I'd prefer to think your memory has conflated these two things, and that
> > > this is how you remember it. That's better than the alternative.
> > >
> > > --
> > > ``The lyf so short, the craft so long to lerne.'' - Chaucer
> > > ``Ars longa, vita brevis'' - Hippocrates
> > > Chet Ramey, UTech, CWRU c...@case.edu http://tiswww.cwru.edu/~chet/
> > >
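
The little-endian DWORD case quoted above (bytes `64 00 00 00` for the value 100, where 0x64 is also ASCII 'd') can be read without relying on how the shell treats null bytes. The following is only a sketch under stated assumptions: POSIX `od` is available, shell arithmetic is at least 64 bits wide, and `read_le32` is a hypothetical helper name. The sample file is created by the demo itself and stands in for whatever file actually held the value.

```shell
# read_le32 FILE: print the unsigned 32-bit little-endian integer stored
# in the first four bytes of FILE. Hypothetical helper for illustration.
read_le32() {
    # od prints each byte as a decimal number; the unquoted expansion is
    # intentional so that each number becomes one positional parameter.
    set -- $(od -An -v -tu1 -N4 "$1")
    [ "$#" -eq 4 ] || return 1    # fewer than four bytes were available
    # Byte 0 is least significant, byte 3 most significant.
    echo $(( $1 | ($2 << 8) | ($3 << 16) | ($4 << 24) ))
}

# Demo with a stand-in file containing 64 00 00 00 (octal 144 is 0x64).
sample=$(mktemp)
printf '\144\000\000\000' > "$sample"
read_le32 "$sample"               # prints 100
rm -f "$sample"
```

Because the bytes arrive as decimal text, no null byte is ever stored in a shell variable, so neither the `read` change nor the command substitution warning comes into play.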