bash leaks the old var when using =~ in a function with local BASH_REMATCH
Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -march=x86-64 -mtune=generic -O2 -pipe -fno-plt -DDEFAULT_PATH_VALUE='/usr/local/sbin:/usr/local/bin:/usr/bin' -DSTANDARD_UTILS_PATH='/usr/bin' -DSYS_BASHRC='/etc/bash.bashrc' -DSYS_BASH_LOGOUT='/etc/bash.bash_logout' -DNON_INTERACTIVE_LOGIN_SHELLS
uname output: Linux t420 5.15.41-1-lts #1 SMP Wed, 18 May 2022 13:37:06 + x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu

Bash Version: 5.1
Patch Level: 16
Release Status: release

Description:
If `[[ $str =~ $re ]]' is executed from a function in which `BASH_REMATCH' is local, bash will "leak" the old *global* `BASH_REMATCH' variable.

This happens because in `sh_regmatch()', bash calls these two functions:

    unbind_variable_noref ("BASH_REMATCH");
    rematch = make_new_array_variable ("BASH_REMATCH");

`unbind_variable_noref()' will unbind and `free()' the first variable it can find named "BASH_REMATCH" (giving priority to local variables), while `make_new_array_variable()' will add a new variable named "BASH_REMATCH" to the global variables.

Since the old global BASH_REMATCH variable was not removed, it will not be readable until the new one is removed (using `unset -v BASH_REMATCH').
Repeat-By:
    bash-5.1$ x=
    bash-5.1$ a () [[ $x =~ .* ]]
    bash-5.1$ (ulimit -v 5000; for ((i=0;i<8000;++i)); do a; done)
    bash-5.1$ a () { local BASH_REMATCH; [[ $x =~ .* ]] ;}
    bash-5.1$ (ulimit -v 5000; for ((i=0;i<8000;++i)); do a; done)
    bash: xmalloc: cannot allocate 32 bytes

    bash-5.1$ a () { local BASH_REMATCH; [[ $1 =~ .* ]] ;}
    bash-5.1$ declare -p BASH_REMATCH
    bash: declare: BASH_REMATCH: not found
    bash-5.1$ a abc; declare -p BASH_REMATCH
    declare -a BASH_REMATCH=([0]="abc")
    bash-5.1$ a xyz; declare -p BASH_REMATCH
    declare -a BASH_REMATCH=([0]="xyz")
    bash-5.1$ a hello; declare -p BASH_REMATCH
    declare -a BASH_REMATCH=([0]="hello")
    bash-5.1$ unset -v BASH_REMATCH; declare -p BASH_REMATCH
    declare -a BASH_REMATCH=([0]="xyz")
    bash-5.1$ unset -v BASH_REMATCH; declare -p BASH_REMATCH
    declare -a BASH_REMATCH=([0]="abc")
    bash-5.1$ unset -v BASH_REMATCH; declare -p BASH_REMATCH
    bash: declare: BASH_REMATCH: not found

This also occurs on the devel branch.

Fix:
The obvious fix is to use, instead of `unbind_variable_noref()', a similar function that uses `global_variables' instead of `shell_variables'.

That will remove the "variable leak", but it is still not great: declaring a local `BASH_REMATCH' makes it impossible to access the matches of `[[ $str =~ $re ]]' because bash will set the global `BASH_REMATCH' instead of the local one, and `"${BASH_REMATCH[@]}"' will expand to local `BASH_REMATCH'.

I think allowing `BASH_REMATCH' to be local-ised should be considered: it would be nice. (Also, it's a little confusing that `MAPFILE', `REPLY', `COPROC', etc. can be localised, but `BASH_REMATCH' cannot.)

bash will currently (once the unbind part is fixed) try to remove the global `BASH_REMATCH' and replace it with a brand new array variable that contains the matches. It could instead replace the local `BASH_REMATCH' variable with a new local array variable (if a local `BASH_REMATCH' variable of any type was present.)
I am not sure if bash has any specific reason to use this technique instead of just using `find_or_make_array_variable()', as other features in bash that use arrays do.

I think bash could just make `[[ $str =~ $re ]]' use `find_or_make_array_variable()' as well. If the variable that already exists has incompatible attributes (e.g. -A or -r), bash could either print an error message or simply, and silently, not set BASH_REMATCH. In both cases it should still return 0/1 depending on the result of the match: BASH_REMATCH not being settable should not influence the exit status of `[[ $str =~ $re ]]'.

This would also allow using attributes like -l and -u with BASH_REMATCH (`declare -l BASH_REMATCH'), which may be useful.
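The leak described in this report can be illustrated with a small toy model. The sketch below is not bash source code: the `struct scope' and `struct var' types and the functions `push_var()', `unbind_nearest()', `free_scope()', `count_in()' and `simulate_matches()' are all hypothetical names made up for illustration. It only assumes what the report states: the unbind step removes the nearest variable of that name, and the new array is always added to the global scope.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy model only: these types and functions are hypothetical and do NOT
   mirror bash's real variable tables. */
struct var {
    const char *name;
    struct var *next;       /* older variable in the same scope */
};

struct scope {
    struct var *vars;       /* newest variable first */
    struct scope *outer;    /* enclosing scope; NULL for the global scope */
};

/* Add a variable to the front of a scope without checking for duplicates. */
static void push_var(struct scope *s, const char *name)
{
    struct var *v = malloc(sizeof *v);
    assert(v != NULL);
    v->name = name;
    v->next = s->vars;
    s->vars = v;
}

/* Remove the first variable called `name`, searching from `s` outward.
   Models the lookup order the report attributes to unbind_variable_noref(). */
static int unbind_nearest(struct scope *s, const char *name)
{
    for (; s != NULL; s = s->outer) {
        for (struct var **p = &s->vars; *p != NULL; p = &(*p)->next) {
            if (strcmp((*p)->name, name) == 0) {
                struct var *dead = *p;
                *p = dead->next;
                free(dead);
                return 1;
            }
        }
    }
    return 0;
}

/* Free every variable that belongs to one scope. */
static void free_scope(struct scope *s)
{
    while (s->vars != NULL) {
        struct var *dead = s->vars;
        s->vars = dead->next;
        free(dead);
    }
}

/* Count the variables called `name` in one scope. */
static int count_in(const struct scope *s, const char *name)
{
    int n = 0;
    for (const struct var *v = s->vars; v != NULL; v = v->next)
        if (strcmp(v->name, name) == 0)
            n++;
    return n;
}

/* Simulate `calls` invocations of a function that declares a local
   BASH_REMATCH and then performs a regex match. With unbind_global_only == 0
   the unbind step starts at the local scope (the reported behaviour);
   otherwise it starts at the global scope (the "obvious fix").
   Returns how many BASH_REMATCH entries the global scope ends up holding. */
int simulate_matches(int calls, int unbind_global_only)
{
    struct scope global = { NULL, NULL };
    for (int i = 0; i < calls; i++) {
        struct scope local = { NULL, &global };
        push_var(&local, "BASH_REMATCH");       /* local BASH_REMATCH */
        unbind_nearest(unbind_global_only ? &global : &local, "BASH_REMATCH");
        push_var(&global, "BASH_REMATCH");      /* new array, always global */
        free_scope(&local);                     /* function returns */
    }
    int n = count_in(&global, "BASH_REMATCH");
    free_scope(&global);
    return n;
}
```

In this model, `simulate_matches(3, 0)' leaves three stacked global entries, which corresponds to the three `unset -v BASH_REMATCH' calls needed in the transcript, while `simulate_matches(3, 1)' leaves one.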
Re: Unfortunate error message for invalid executable
I'm sure I'm not the first person to want to have a long philosophical conversation with the engineer who put the bolt I need to reach in order to fix my car in a place that requires me to disassemble 20 other unrelated things. Nor am I likely to be the first person to want to reclaim the time wasted by such choices.

... but here goes:

While I understand all of these arguments, they seem to me to be inappropriately brushing off the issue based on highly technical and simultaneously highly user-unfriendly reasoning. Bash, in the end, is a user-space tool that is directly aimed at interfacing humans to the machine. It is, after all, a *shell*.

In other words, for every provably accurate reason given below, there exists an argument that bash (when interpreting the error code returned) could do more. It could:

* Not print only the error string associated with ENOENT, given that it is well known that the error code may not in fact apply specifically to the file the user asked bash to execute. E.g., it could indicate this well-known fact.
* Apply a reasonable set of heuristics to determine whether the error returned by the kernel / execve() call makes direct sense for the path passed to the call, and issue a helpful message.

Maybe the concern is that any additional calls (such as checking for path existence) may have unintended consequences. I'm not qualified to say definitively, but that seems unlikely.

Therefore, IMHO it is very hard to argue with the fact that the file passed to the kernel does in fact exist and therefore that ENOENT is provably false *for the path with which the user is directly interacting*.

It seems therefore valid that, irrespective of kernel/distribution/etc/etc, if ENOENT is returned when executing a path that does in fact exist, bash could print something more than the error string expansion of ENOENT (whether being obtuse about it is an anachronistic unix-ism or not).
I.e., something like "I'm not sure what's going on, but your path definitely exists, yet the kernel says otherwise."

... something like fprintf(stderr,"No such file or directory while attempting to execute %s (it exists, but cannot be executed)",path);

All the best and thanks for the insights!

A.

PS: None of the foregoing is offered with anything but a smile.

On 5/26/22 7:39 PM, Dale R. Worley wrote:

Chet Ramey writes:

On 5/26/22 2:27 PM, AA via Bug reports for the GNU Bourne Again SHell wrote:

When a user attempts to execute an executable that is not recognized as an executable by the system, the generated error is "No such file or directory"

In this case, it's the errno value returned from execve(2), and it's exactly correct, at least from the kernel's perspective.

It's not that the executable isn't recognized or in an invalid format, in which case execve would return ENOEXEC. It's that the ELF header specifies a particular interpreter to run on the file (e.g., ld.so), and that file is the one that is not found (ENOENT).

This parallels the annoying Unixism that if you attempt to execute a file that is marked executable that has a #! interpreter specification, but the specified interpreter does not exist, the generated error is "No such file or directory".

It would be nice if the kernel generated a separate errno for "a supporting executable for this executable file does not exist" but nobody's bothered to do that.

Dale
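The heuristic proposed in this message (report something more specific when execve() fails with ENOENT although the path itself exists) could be sketched roughly as follows. This is only an illustration under stated assumptions, not bash code and not a proposed patch: the helper name `describe_exec_error()' and the wording of the hint are made up, and the sketch assumes the caller saved `errno' right after the failed execve() call. It uses only the standard access(), strerror() and snprintf() calls.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper, not part of bash. Formats a message for a failed
   execve() of `path`, where `err` is the errno value saved right after the
   call. Returns 1 if the extended hint was added, 0 otherwise. */
int describe_exec_error(const char *path, int err, char *buf, size_t buflen)
{
    assert(path != NULL && buf != NULL && buflen > 0);

    /* ENOENT for a path that does exist suggests that some other file the
       kernel needed (for example the ELF or #! interpreter) is missing.
       The check is a heuristic: the file could appear or disappear between
       the failed execve() and this access() call. */
    if (err == ENOENT && access(path, F_OK) == 0) {
        snprintf(buf, buflen,
                 "%s: %s (the file itself exists; a file it needs in order "
                 "to run, such as its interpreter, may be missing)",
                 path, strerror(err));
        return 1;
    }

    /* Otherwise fall back to the plain error string. */
    snprintf(buf, buflen, "%s: %s", path, strerror(err));
    return 0;
}
```

A caller would pass the resulting buffer to whatever error-reporting routine it already uses. The extra access() call is the kind of additional check whose possible side effects the message above wonders about, so a real change would need to weigh that.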
Re: Unfortunate error message for invalid executable
AA via Bug reports for the GNU Bourne Again SHell writes:
> I.e., something like "I'm not sure what's going on, but your path
> definitely exists, yet the kernel says otherwise."
>
> ... something like fprintf(stderr,"No such file or directory while
> attempting to execute %s (it exists, but cannot be executed)",path);

Historically, the way to get something like this to happen is to design and code the modification that does it. That has the advantage that you have to bite the bullet and instead of just describing the general idea, decide on a concrete implementation. That sounds obvious, but there is a long history of ideas in software that *sound good* but for which there is no implementation that sucks less than the problem the idea seeks to solve.

Dale
Re: Unfortunate error message for invalid executable
I’ll give it a shot.

A.

> On May 28, 2022, at 4:35 PM, Dale R. Worley wrote:
>
> AA via Bug reports for the GNU Bourne Again SHell
> writes:
>> I.e., something like "I'm not sure what's going on, but your path
>> definitely exists, yet the kernel says otherwise."
>>
>> ... something like fprintf(stderr,"No such file or directory while
>> attempting to execute %s (it exists, but cannot be executed)",path);
>
> Historically, the way to get something like this to happen is to design
> and code the modification that does it. That has the advantage that you
> have to bite the bullet and instead of just describing the general idea,
> decide on a concrete implementation. That sounds obvious, but there is
> a long history of ideas in software that *sound good* but for which
> there is no implementation that sucks less than the problem the idea
> seeks to solve.
>
> Dale