bash leaks the old var when using =~ in a function with local BASH_REMATCH

2022-05-28 Thread Emanuele Torre
Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -march=x86-64 -mtune=generic -O2 -pipe -fno-plt
-DDEFAULT_PATH_VALUE='/usr/local/sbin:/usr/local/bin:/usr/bin'
-DSTANDARD_UTILS_PATH='/usr/bin' -DSYS_BASHRC='/etc/bash.bashrc'
-DSYS_BASH_LOGOUT='/etc/bash.bash_logout'
-DNON_INTERACTIVE_LOGIN_SHELLS
uname output: Linux t420 5.15.41-1-lts #1 SMP Wed, 18 May 2022
13:37:06 + x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu

Bash Version: 5.1
Patch Level: 16
Release Status: release

Description:
If `[[ $str =~ $re ]]' is executed from a function in which
`BASH_REMATCH' is local, bash will "leak" the old *global*
`BASH_REMATCH' variable.

This happens because in `sh_regmatch()', bash calls these two
functions:

unbind_variable_noref ("BASH_REMATCH");
rematch = make_new_array_variable ("BASH_REMATCH");

`unbind_variable_noref()' will unbind and `free()' the first
variable it can find named "BASH_REMATCH" (giving priority to
local variables).

`make_new_array_variable()', on the other hand, will add a new
variable named "BASH_REMATCH" to the global variables.

Since the old global BASH_REMATCH variable was not removed, it
is now shadowed by the new one and will not be readable until
the new one is removed (using `unset -v BASH_REMATCH').

Repeat-By:
bash-5.1$ x=
bash-5.1$ a () [[ $x =~ .* ]]
bash-5.1$ (ulimit -v 5000; for ((i=0;i<8000;++i)); do a; done)
bash-5.1$ a () { local BASH_REMATCH; [[ $x =~ .* ]] ;}
bash-5.1$ (ulimit -v 5000; for ((i=0;i<8000;++i)); do a; done)
bash: xmalloc: cannot allocate 32 bytes

bash-5.1$ a () { local BASH_REMATCH; [[ $1 =~ .* ]] ;}
bash-5.1$ declare -p BASH_REMATCH
bash: declare: BASH_REMATCH: not found
bash-5.1$ a abc; declare -p BASH_REMATCH
declare -a BASH_REMATCH=([0]="abc")
bash-5.1$ a xyz; declare -p BASH_REMATCH
declare -a BASH_REMATCH=([0]="xyz")
bash-5.1$ a hello; declare -p BASH_REMATCH
declare -a BASH_REMATCH=([0]="hello")
bash-5.1$ unset -v BASH_REMATCH; declare -p BASH_REMATCH
declare -a BASH_REMATCH=([0]="xyz")
bash-5.1$ unset -v BASH_REMATCH; declare -p BASH_REMATCH
declare -a BASH_REMATCH=([0]="abc")
bash-5.1$ unset -v BASH_REMATCH; declare -p BASH_REMATCH
bash: declare: BASH_REMATCH: not found

This also occurs on the devel branch.
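
The manual `unset -v' steps above can be scripted to check a
particular build. This is only a minimal sketch: the function name
`count_rematch_layers' and the iteration cap are hypothetical
choices made for illustration, not anything bash provides.

```bash
# Function from the report: BASH_REMATCH is local while the match runs.
a() { local BASH_REMATCH; [[ $1 =~ .* ]]; }

# Count how many stacked BASH_REMATCH variables are visible by unsetting
# them one at a time. The cap (default 100) is an assumption that guards
# against looping forever if the variable cannot be unset.
count_rematch_layers() {
  local n=0 limit=${1:-100}
  while (( n < limit )) && declare -p BASH_REMATCH >/dev/null 2>&1; do
    unset -v BASH_REMATCH
    n=$(( n + 1 ))
  done
  printf '%s\n' "$n"
}

unset -v BASH_REMATCH 2>/dev/null
a abc; a xyz; a hello
# Run the count in a subshell so the current shell's variables are untouched.
layers=$(count_rematch_layers)
echo "visible BASH_REMATCH layers: $layers"
```

Going by the transcript above, an affected build should report 3
layers here; a build without the leak should report 0 or 1.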

Fix:
The obvious fix is to use, instead of `unbind_variable_noref()',
a similar function that uses `global_variables' instead of
`shell_variables'.

That will remove the "variable leak", but it is still not great:
declaring a local `BASH_REMATCH' makes it impossible to access
the matches of `[[ $str =~ $re ]]' because bash will set the
global `BASH_REMATCH' instead of the local one, and
`"${BASH_REMATCH[@]}"' will expand to local `BASH_REMATCH'.

I think allowing `BASH_REMATCH' to be local-ised should be
considered: it would be nice. (Also, it's a little confusing
that `MAPFILE', `REPLY', `COPROC', etc. can be localised, but
`BASH_REMATCH' cannot.)

bash will currently (once the unbind part is fixed) try to
remove the global `BASH_REMATCH' and replace it with a brand new
array variable that contains the matches.

It could instead replace the local `BASH_REMATCH' variable with
a new local array variable (if a local `BASH_REMATCH' variable
of any type was present.)

I am not sure if bash has any specific reason to use this
technique instead of just using `find_or_make_array_variable()'
like other features in bash that use arrays do.

I think bash could just make `[[ $str =~ $re ]]' use
`find_or_make_array_variable()' like other bash features that
use arrays do. If the variable that already exists has
incompatible attributes (e.g. -A or -r), it could just print an
error message (while still returning 0/1 depending on the result
of the match: BASH_REMATCH not being settable should not
influence the exit status of `[[ $str =~ $re ]]'), or silently
skip setting BASH_REMATCH.
This would also allow using attributes like -l and -u with
BASH_REMATCH (`declare -l BASH_REMATCH'), which may be useful.
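
Until something along these lines is implemented, a function can
keep its matches private without declaring `BASH_REMATCH' local by
copying the global array into a local one right after the match.
A minimal sketch, assuming bash 4.3 or later for namerefs;
`match_into' and `parse_version' are hypothetical helper names,
not part of bash:

```bash
# match_into ARRAY_NAME STRING REGEX
# Hypothetical helper: let [[ =~ ]] set the global BASH_REMATCH as usual,
# then copy the result into the array named by the caller.
match_into() {
  local -n _match_out=$1   # nameref to the caller's array (bash >= 4.3)
  local _str=$2 _re=$3
  _match_out=()
  [[ $_str =~ $_re ]] || return 1
  _match_out=("${BASH_REMATCH[@]}")
}

# Example caller: the matches live in the local array `m'.
parse_version() {
  local -a m
  match_into m "$1" '^([0-9]+)\.([0-9]+)$' || return 1
  printf 'major=%s minor=%s\n' "${m[1]}" "${m[2]}"
}

parse_version 5.1   # prints "major=5 minor=1"
```

The caller's array must not itself be named `_match_out', since
the nameref would then refer to itself. The global `BASH_REMATCH'
is still overwritten by every match; the copy only protects the
caller from later matches made elsewhere.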



Re: Unfortunate error message for invalid executable

2022-05-28 Thread AA via Bug reports for the GNU Bourne Again SHell



I'm sure I'm not the first person to want to have a long philosophical
conversation with the engineer that put the bolt I need to reach in
order to fix my car, in the place that requires me to disassemble 20
other unrelated things. Nor am I likely to be the first person to want
to reclaim the time wasted by such choices.


... but here goes:

While I understand all of these arguments, they seem to me to be 
inappropriately brushing off the issue based on highly technical and 
simultaneously highly user unfriendly reasoning. Bash, in the end, is a 
user space tool that is directly aimed at interfacing humans to the 
machine. It is, after all a *shell*.


In other words, for every provably accurate reason given below, there 
exists an argument that bash (when interpreting the error code returned) 
could do more. It could:


* Not print only the error string associated with ENOENT, given that it
is well known that the error code may not in fact apply specifically to
the file the user asked bash to execute. E.g., it could indicate this
well known fact.
* Apply a reasonable set of heuristics to determine if the error
returned by the kernel / execve() call makes direct sense for the path
passed to the call, and issue a helpful message.


Maybe the concern is that any additional calls (such as checking for 
path existence) may have unintended consequences. I'm not qualified to 
say definitively, but that seems unlikely.


Therefore, IMHO it is very hard to argue with the fact that the file 
passed to the kernel does in fact exist and therefore that ENOENT is 
provably false *for the path with which the user is directly 
interacting*. It seems therefore valid that, irrespective of 
kernel/distribution/etc/etc if ENOENT is returned when executing a path 
that does in fact exist, bash could print something more than the error 
string expansion of ENOENT (whether being obtuse about it is an 
anachronistic unix-ism or not).


I.e., something like "I'm not sure what's going on, but your path 
definitely exists, yet the kernel says otherwise."


... something like fprintf(STDERR,"No such file or directory while 
attempting to execute %s (it exists, but cannot be executed)",path);
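
As a rough illustration of the heuristic, sketched in shell rather
than in bash's C source: once execve() has failed with ENOENT, check
whether the path itself exists and, if it does, whether a #! line
names a missing interpreter. `explain_exec_failure' is a hypothetical
name and the message wording is only an assumption; this is not how
bash is implemented.

```bash
# explain_exec_failure PATH
# Hypothetical helper, meant to be called after an exec attempt on PATH
# failed with ENOENT. Prints a more specific diagnostic when it can.
explain_exec_failure() {
  local path=$1 first='' interp
  if [[ ! -e $path ]]; then
    # ENOENT really is about the path the user typed.
    printf '%s: No such file or directory\n' "$path"
    return
  fi
  # The path exists, so ENOENT must refer to something else.
  # Read at most 256 characters of the first line; ignore read errors.
  { IFS= read -r -n 256 first < "$path"; } 2>/dev/null
  if [[ $first == '#!'* ]]; then
    interp=${first#'#!'}
    interp=${interp#"${interp%%[![:space:]]*}"}   # strip leading blanks
    interp=${interp%%[[:space:]]*}                # drop interpreter arguments
    if [[ -n $interp && ! -e $interp ]]; then
      printf '%s: interpreter %s named on the #! line does not exist\n' \
        "$path" "$interp"
      return
    fi
  fi
  printf '%s: exists, but the kernel reported "No such file or directory" (a missing ELF interpreter is one possible cause)\n' "$path"
}
```

The extra cost is one existence test and one short read, and only on
the failure path.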


All the best and thanks for the insights!
A.

PS: None of the foregoing is offered with anything but a smile.

On 5/26/22 7:39 PM, Dale R. Worley wrote:
> Chet Ramey writes:
>> On 5/26/22 2:27 PM, AA via Bug reports for the GNU Bourne Again SHell wrote:
>>> When a user attempts to execute an executable that is not
>>> recognized as an executable by the system, the generated error is "No such
>>> file or directory"
>>
>> In this case, it's the errno value returned from execve(2), and it's
>> exactly correct, at least from the kernel's perspective.
>>
>> It's not that the executable isn't recognized or in an invalid format, in
>> which case execve would return ENOEXEC. It's that the ELF header specifies
>> a particular interpreter to run on the file (e.g., ld.so), and that file is
>> the one that is not found (ENOENT).
>
> This parallels the annoying Unixism that if you attempt to execute a
> file that is marked executable that has a #! interpreter specification,
> but the specified interpreter does not exist, the generated error is "No
> such file or directory".  It would be nice if the kernel generated a
> separate errno for "a supporting executable for this executable file
> does not exist" but nobody's bothered to do that.
>
> Dale


Re: Unfortunate error message for invalid executable

2022-05-28 Thread Dale R. Worley
AA via Bug reports for the GNU Bourne Again SHell 
writes:
> I.e., something like "I'm not sure what's going on, but your path 
> definitely exists, yet the kernel says otherwise."
>
> ... something like fprintf(STDERR,"No such file or directory while 
> attempting to execute %s (it exists, but cannot be executed)",path);

Historically, the way to get something like this to happen is to design
and code the modification that does it.  That has the advantage that you
have to bite the bullet and instead of just describing the general idea,
decide on a concrete implementation.  That sounds obvious, but there is
a long history of ideas in software that *sound good* but for which
there is no implementation that sucks less than the problem the idea
seeks to solve.

Dale



Re: Unfortunate error message for invalid executable

2022-05-28 Thread Andrew Athan via Bug reports for the GNU Bourne Again SHell
I’ll give it a shot.

A.

> On May 28, 2022, at 4:35 PM, Dale R. Worley  wrote:
> 
> AA via Bug reports for the GNU Bourne Again SHell 
> writes:
>> I.e., something like "I'm not sure what's going on, but your path 
>> definitely exists, yet the kernel says otherwise."
>> 
>> ... something like fprintf(STDERR,"No such file or directory while 
>> attempting to execute %s (it exists, but cannot be executed)",path);
> 
> Historically, the way to get something like this to happen is to design
> and code the modification that does it.  That has the advantage that you
> have to bite the bullet and instead of just describing the general idea,
> decide on a concrete implementation.  That sounds obvious, but there is
> a long history of ideas in software that *sound good* but for which
> there is no implementation that sucks less than the problem the idea
> seeks to solve.
> 
> Dale