Re: Variables can’t contain NUL

George Wed, 20 Jun 2018 07:23:47 -0700

On Sun, 2018-05-20 at 04:56 +0200, Garreau, Alexandre wrote:
> On 2015-11-13 at 07:17, Greg Wooledge wrote:
> Actually in the most general case, where those output streams may
> contain NUL bytes, it requires two temp files, because you can't store
> arbitrary data streams in bash variables at all.
> 
> Why do bash variables use 0-terminated arrays instead of arrays structure
> with a length attribute?
>


This question interests me for various reasons. I don't really favor the idea 
that a shell shouldn't be considered a "real" programming language or be held 
to that kind of standard, though that is often difficult to reconcile with 
backward compatibility and POSIX compatibility.

Apart from the reasons already given: shells tend to assume some level of 
equivalence between facilities the shell language provides, and similar 
facilities the OS provides.

For instance, shell variables are generally assumed to work the same as OS 
environment variables. These days there are cases where the two diverge: shell 
variables support arrays and such, while environment variables do not, so you 
can't "export" an array variable.
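
A minimal sketch of that divergence, assuming bash is the shell in use and is 
also available on PATH for the child process. The variable names are made up 
for illustration:

```shell
#!/usr/bin/env bash
# A scalar variable can be exported, and a child process sees it.
scalar="hello"
export scalar

# An array can be marked for export, but bash does not place it in the
# environment, so a child process never sees it.
arr=(one two three)
export arr

child_scalar=$(bash -c 'printf "%s" "${scalar-unset}"')
child_arr=$(bash -c 'printf "%s" "${arr-unset}"')

printf 'scalar in child: %s\n' "$child_scalar"   # hello
printf 'array in child:  %s\n' "$child_arr"      # unset
```

The export itself is not an error; the array simply never makes it into the 
child's environment.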

Encoding shell variables as length-prefixed arrays would create another such 
disparity: the underlying OS mechanisms for environment variables generally 
assume a NUL terminates each environment string (see for instance execve() or 
"man 7 environ"). Even if the environment could be (mis-?)used to carry data 
with NUL in it, the program receiving that data would have to follow the same 
convention for how to use it, or the data would effectively be lost.

Shell variables could be allowed to contain NUL (similar to how shell 
variables can provide arrays, etc. but can't "export" them), but then there's 
an additional problem: what you can do with them. You can't pass a NUL as part 
of a command-line argument to an external command (because, like the 
environment, each argv[] string is by convention assumed to be NUL-terminated, 
and the OS itself may enforce that assumption in some cases), so you'd be 
pretty much limited to internal commands and shell functions, creating another 
disparity. "Disparities" aren't just theoretical problems or aesthetic 
blemishes; they turn into user frustration and bug reports (as in "I put a NUL 
in a variable and it didn't work right").
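
To make the "it didn't work right" case concrete, here is a small sketch of 
what bash does today when a NUL heads for a variable. It assumes bash; the 
variable names are only for illustration:

```shell
#!/usr/bin/env bash
# Command substitution discards the NUL byte. bash 4.4 and later also print
# a warning on stderr, which is silenced here.
{ value=$(printf 'a\0b'); } 2>/dev/null
printf 'value:  %s (length %d)\n' "$value" "${#value}"       # ab, length 2

# An ANSI-C quoted literal is cut off at the first NUL instead.
literal=$'a\0b'
printf 'literal: %s (length %d)\n' "$literal" "${#literal}"  # a, length 1
```

Either way the original three bytes are gone before any command, internal or 
external, gets to see them.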

Personally I do think some method of handling arbitrary binary data in the 
shell would be a welcome addition (I think zsh provides that; I don't remember 
if ksh does). It's just hard to reconcile with some of the other underlying 
assumptions of the shell.
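
Until then, one common workaround (besides the temp files mentioned earlier in 
the thread) is to keep the data encoded as text while it lives in a variable. 
The following is only a sketch of that idea, not something bash provides: it 
assumes bash plus the standard od and tr utilities, and decode_hex is a 
hypothetical helper name.

```shell
#!/usr/bin/env bash
# Encode arbitrary bytes as hex text, which is safe to hold in a variable.
hex=$(printf 'a\0b' | od -An -v -tx1 | tr -d ' \n')
printf 'hex: %s\n' "$hex"   # 610062

# decode_hex (hypothetical helper): turn each hex pair into a \xHH escape
# and let the printf builtin write the raw bytes to stdout.
decode_hex() {
    local h=$1 esc=''
    while [ -n "$h" ]; do
        esc+="\\x${h:0:2}"
        h=${h:2}
    done
    printf "$esc"
}

# Round trip: decode to a stream and re-encode; the hex text should match.
roundtrip=$(decode_hex "$hex" | od -An -v -tx1 | tr -d ' \n')
printf 'roundtrip: %s\n' "$roundtrip"
```

The decoded bytes only ever exist in a pipe or a file, never in a variable or 
an argument, which is exactly the limitation described above.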
