indirect variable behavior breakage
From: cba...@rajant.com To: bug-bash@gnu.org Subject: indirect variable behavior breakage Configuration Information [Automatically generated, do not change]: Machine: x86_64 OS: linux-gnu Compiler: gcc Compilation CFLAGS: -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64' -DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-redhat-linux-gnu' -DCONF_VENDOR='redhat' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash' -DSHELL -DHAVE_CONFIG_H -I. -I. -I./include -I./lib -D_GNU_SOURCE -DRECYCLES_PIDS -DDEFAULT_PATH_VALUE='/usr/local/bin:/usr/bin' -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -m64 -mtune=generic uname output: Linux crucible.rajant.com 4.13.16-100.fc25.x86_64 #1 SMP Mon Nov 27 19:52:46 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux Machine Type: x86_64-redhat-linux-gnu Bash Version: 4.3 (works) Patch Level: 43 # Configuration Information [Automatically generated, do not change]: Machine: x86_64 OS: linux-gnu Compiler: gcc Compilation CFLAGS: -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64' -DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-redhat-linux-gnu' -DCONF_VENDOR='redhat' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash' -DSHELL -DHAVE_CONFIG_H -I. -I. -I./include -I./lib -D_GNU_SOURCE -DRECYCLES_PIDS -DDEFAULT_PATH_VALUE='/usr/local/bin:/usr/bin' -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -m64 -mtune=generic -Wno-parentheses -Wno-format-security uname output: Linux 807847456c29 4.15.3-300.fc27.x86_64 #1 SMP Tue Feb 13 17:02:01 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux Machine Type: x86_64-redhat-linux-gnu Bash Version: 4.4 (does not work) Patch Level: 12 Release Status: release Description: a simple function that is used to test if a var is indirect or not and return the value whether indirect or simply a string is no longer working as expected on newer versions of bash. Repeat-By: This function works fine on this version (4.3.43) of bash (and all previous versions I have access to, e.g. 4.2.46). It is broken, that I am aware of, in 4.4.12 and 4.4.19, and possibly other versions as well. function ivar() { echo -n "${!1:-${1}}"; } This will take in a word as $1, and if it is a ref, will echo the value of ref, but if it is not, will simply echo $1. This function cares not about the string handed in, it rejects the indirection silently, and IMHO correctly, regardless of the characters in that string. I have found this function to be incredibly handy, and it's in extensive use. It's simplicity and usefulness are very much relied upon. Now, if the string starts with a number, or has a dash (possibly other chars as well), it throws a syntax error. It's not clear to me why it now feels the need to do this. Usage: # in a working bash version: $ ping=pong $ ivar "${ping}" pong <-result $ ivar 1.0.0 1.0.0 <-result $ ivar m-4 m-4 <-result $ declare -A myhash=( [foo]=bar [bar]=baz [baz]=foo ) $ ivar "myhash[@]" baz foo bar <-result $ ivar "myhash[foo]" bar <-result # in a newer, non-working bash version: $ ivar 1.0.0 bash: 1.0.0: bad substitution <-result $ ivar m-4 bash: m-4: bad substitution <-result Fix: The fix is to please restore the previous behavior. I'm assuming it's that no one noticed this use-case breakage, or they would not have made the change. I think what's happened is before, bash looked into it's environment to see if the string matched an already defined variable, and if it did, it returned the value of the match. I think a change was introduced that says, hey I'm going to see if this is a validly formatted variable name first, and if not throw an error. The prior way also did that, but it did it when the actual variable was set, not when it was being looked up. Obviously, it won't find a match if the name is invalid, so this may have been seen as an optimization. But in this case, it would simply be better to return nothing and do nothing, not throw an error. The name does not NEED to be valid here. Anyway, that my guess :) Thanks for looking into it, Regards, Christopher Barry Rajant
Re: indirect variable behavior breakage
On Wed, 7 Mar 2018 09:51:59 -0500 Chet Ramey wrote: > On 3/6/18 11:38 PM, christopher barry wrote: > > > Bash Version: 4.4 (does not work) > > Patch Level: 12 > > Release Status: release > > > > > > Description: > > > > a simple function that is used to test if a var is indirect or not > > and return the value whether indirect or simply a string is no > > longer working as expected on newer versions of bash. > > > > Repeat-By: > > > > This function works fine on this version (4.3.43) of bash (and > > all previous versions I have access to, e.g. 4.2.46). It is broken, > > that I am aware of, in 4.4.12 and 4.4.19, and possibly other > > versions as well. > > > > function ivar() { echo -n "${!1:-${1}}"; } > > > > This will take in a word as $1, and if it is a ref, will echo the > > value of ref, but if it is not, will simply echo $1. This function > > cares not about the string handed in, it rejects the indirection > > silently, and IMHO correctly, regardless of the characters in that > > string. I have found this function to be incredibly handy, and it's > > in extensive use. It's simplicity and usefulness are very much > > relied upon. > > > > Now, if the string starts with a number, or has a dash (possibly > > other chars as well), it throws a syntax error. It's not clear to > > me why it now feels the need to do this. > > The strings you're describing are invalid variable names. They would > generate variable expansion errors if used directly; why should they > not generate the same error when used as the value of an indirect > expansion? The two cases are supposed to be identical. > > These should all generate errors (and in bash-4.4, they generate > the same `bad substitution' error): > > echo "${-3:-${-3}}"; > echo ${-3} > > x=-3 > echo ${!x} > > function ivar() { echo -n "${!1:-${1}}"; } > ivar -3 > > It was a bug in bash-4.3 that the latter two cases didn't produce > errors, and, in the `!x' case, produced random output and occasional > core dumps. > > Chet Hi Chet, Thanks the quick reply, for bash and all the work you've put into it, and thanks for continuing to fix bugs too. Of course sanitizing the input before evaluation is always good, and your fix does just that by seeing an invalid var name and throwing an error. Finding it's invalid is good, throwing an error is possibly overkill. My point is, I already cannot: $ declare -3=myval bash: declare: -3: invalid option declare: usage: declare [-aAfFgilrtux] [-p] [name[=value] ...] $ declare wrong-name=100 bash: declare: `wrong-name=100': not a valid identifier $ declare 3var=something bash: declare: `3var=something': not a valid identifier and I cannot create any other variable that has an invalid name because bash correctly disallows it from ever occurring. So yes 'ivar -3' is very, very bad indeed, and was a nasty bug, and I absolutely agree that making sure the string is a validly formatted var name before attempting to evaluate it is the right approach. My only request is for bash to not throw an error when it realizes it's invalid. Returning 1 and no data seems like a reasonable response that fixes the prior bug, but will not break this use-case. Is there a reason this is an unreasonable request or an undesirable behavior I am not understanding? Thanks again, Christopher Barry Rajant
Re: indirect variable behavior breakage
On Wed, 7 Mar 2018 11:09:10 -0500 Greg Wooledge wrote: > On Wed, Mar 07, 2018 at 10:54:22AM -0500, christopher barry wrote: > > So yes 'ivar -3' is very, very bad indeed, and was a nasty bug, and > > I absolutely agree that making sure the string is a validly > > formatted var name before attempting to evaluate it is the right > > approach. > > > > My only request is for bash to not throw an error when it realizes > > it's invalid. Returning 1 and no data seems like a reasonable > > response that fixes the prior bug, but will not break this > > use-case. > > Just out of curiosity, what *is* this use case, and why are you not > using an associative array? > > It seems like you are trying to map strings of gibberish to nothing, > and strings of non-gibberish to values. An associative array does > that. I am in fact using this method with associative arrays. I have a default hash that is full of default values for a particular generic type of thing. Variations of this thing use many of the defaults, but also set some values uniquely. I could simply duplicate this hash everywhere, and edit the differences, but then changing any defaults would require a change everywhere. As an alternative, I created a template hash that all non-default things start out with. this template hash has all of it's values pointing to "default_hash[somekey]" new things that need to change their specific values simply override that and put in their string. Now, changes to actual values in default_hash are seen everywhere they are not overridden locally by the specific thing using it. so reading a hash value may be an indirect pointer to the default_hash or an overridden value set by the specific thing using the hash. ivar solves this nicely. But to get back on point, the bug in bash was nasty, and it's fantastic that it's fixed now, but throwing an error now, rather than simply returning 1 with no data is a bit like throwing the baby out with the bathwater. -C
Re: indirect variable behavior breakage
On Wed, 7 Mar 2018 11:45:13 -0500 christopher barry wrote: ===8<---snip > > I am in fact using this method with associative arrays. > > I have a default hash that is full of default values for a particular > generic type of thing. Variations of this thing use many of the > defaults, but also set some values uniquely. I could simply duplicate > this hash everywhere, and edit the differences, but then changing > any defaults would require a change everywhere. > > As an alternative, I created a template hash that all non-default > things start out with. > > this template hash has all of it's values pointing to > "default_hash[somekey]" > > new things that need to change their specific values simply override > that and put in their string. > > Now, changes to actual values in default_hash are seen everywhere they > are not overridden locally by the specific thing using it. > > so reading a hash value may be an indirect pointer to the default_hash > or an overridden value set by the specific thing using the hash. > > ivar solves this nicely. > > > But to get back on point, the bug in bash was nasty, and it's > fantastic that it's fixed now, but throwing an error now, rather than > simply returning 1 with no data is a bit like throwing the baby out > with the bathwater. > > > -C Chet, Does modifying this current behavior sound like anything you would be willing to entertain? Appreciate your consideration, Regards, Christopher Barry Rajant
Re: indirect variable behavior breakage
On Fri, 9 Mar 2018 11:29:35 -0500 Chet Ramey wrote: > On 3/7/18 3:20 PM, christopher barry wrote: > > On Wed, 7 Mar 2018 11:45:13 -0500 > > christopher barry wrote: > > > > ===8<---snip > > > >> > >> I am in fact using this method with associative arrays. > >> > >> I have a default hash that is full of default values for a > >> particular generic type of thing. Variations of this thing use > >> many of the defaults, but also set some values uniquely. I could > >> simply duplicate this hash everywhere, and edit the differences, > >> but then changing any defaults would require a change everywhere. > >> > >> As an alternative, I created a template hash that all non-default > >> things start out with. > >> > >> this template hash has all of it's values pointing to > >> "default_hash[somekey]" > >> > >> new things that need to change their specific values simply > >> override that and put in their string. > >> > >> Now, changes to actual values in default_hash are seen everywhere > >> they are not overridden locally by the specific thing using it. > >> > >> so reading a hash value may be an indirect pointer to the > >> default_hash or an overridden value set by the specific thing > >> using the hash. > >> > >> ivar solves this nicely. > >> > >> > >> But to get back on point, the bug in bash was nasty, and it's > >> fantastic that it's fixed now, but throwing an error now, rather > >> than simply returning 1 with no data is a bit like throwing the > >> baby out with the bathwater. > >> > >> > >> -C > > > > Chet, > > > > Does modifying this current behavior sound like anything you would > > be willing to entertain? > > I think that throwing an error when the value of a variable used for > indirection would not be accepted as valid (and throw an error) when > used directly is the right thing to do, and I will leave it as the > default behavior. > > I'm open to suggestions about how to allow alternate behavior: a shell > option, something controlled by the shell compatibility level, > something else? It seems like you're really interested in learning > whether a particular string is a valid shell variable and doing > something based on the result. You could check that directly -- the > pattern to describe a shell variable isn't that hard to write -- but > I get the sense that you'd rather not take the extra step. > > Chet > Hi Chet, It's not that I don't want to take any required steps, it was that it seemed logical to me to simply return 1 in that instance. When I discovered that this method worked the way I was hoping, I was kinda miffed at myself for never having noticed it earlier. Then I was shown it was not working on new versions. I had never hit the core dump issue before you pointed it out. Could the fact that that bug caused a core dump be influencing why you do not want to simply return 1 now? Maybe throwing an error now seems better because it was such a bad bug before? When assigning to a var, format checking absolutely should be done, and an error should be thrown if invalid. But if only checking, e.g. just reading a name, then returning 1 if it's an invalid name or does not exist seems like the smallest, most applicable hammer for the job at hand. I did actually spend some time (longer than I would like to admit) trying to craft a decent regex to make sure that every possible way of asking was covered, but it turns out to be a bit more complex that one may think at first blush. ^([#]*(_[[:digit:]]|[_[:alpha:]])[_[:alnum:]]*)(\[[_[:alnum:]*@]+\])$ testing against ${1}, ${BASH_REMATCH[0]} and ${BASH_REMATCH[1]} would mostly show the valid forms, but this regex was still not quite right. A colleague here at work came up with a beautiful 16oz ball-peen hammer approach, no regex required, as can be seen here: #--- function ivar() # Description: cascading indirect -> regular var expander { ( echo -n "${!1:-${1}}" ) 2>/dev/null || echo -n "${1}" } #--- it solves the problem, so that's what I'm using now, but it's obviously not as clean and simple as it was, and it can still core dump the sub-shell, which does suck, but it's non-fatal. If you can't see my point about not needing to throw an error on just a read, then I guess I'll just have to deal with it. Thanks, and maybe you will come to see my position at some point, never know :) Regards, -C