EXIT trap is not executed after an exec failure in a non-interactive shell

2021-09-30 Thread Mark March
If execfail is set, a failed exec does not cause a non-interactive shell to 
exit, but it seems to reset the EXIT trap:

cat <<'EOF' | bash
shopt -s execfail
trap "echo exiting..." EXIT
exec ~/does-not-exist
echo "exec failed in bash-$BASH_VERSION"
exit 1
EOF

Output:

bash: line 3: /home/march/does-not-exist: No such file or directory
exec failed in bash-5.0.17(1)-release

The "exiting..." line is missing. If you comment out exec ~/does-not-exist, 
"exiting..." will be printed as expected.
I get this under 5.1.8 as well, built with gcc 9.3. This is on Ubuntu 20 on 
x86_64.

-Mark



Re: EXIT trap is not executed after an exec failure in a non-interactive shell

2021-09-30 Thread Mark March
>then no more continued code

man bash(1) says the following for the execfail option:
  execfail
  If set, a non-interactive shell will not exit if it can‐
  not  execute  the  file  specified as an argument to the
  exec builtin command.  An  interactive  shell  does  not
  exit if exec fails.

It says nothing about the traps getting reset. In my example the script clearly 
continues to execute after the failed exec, as it should (since execfail is 
set).

-Mark



On Thursday, September 30, 2021, 04:47:13 PM PDT, Alex fxmbsw7 Ratchev 
 wrote: 





I think it's not a bug but normal runtime behavior. As I've understood it,
exec 'replaces' the shell with the given command, so no more code continues
after it. Though the error checking could maybe be better.





Re: EXIT trap is not executed after an exec failure in a non-interactive shell

2021-10-01 Thread Mark March
Ok, thank you for clarifying. There is nothing in the documentation about this 
behavior as far as I can tell. I would suggest adding a line about traps 
getting reset after a failed exec to the paragraph on 'execfail'.

 -Mark

On Friday, October 1, 2021, 07:02:34 AM PDT, Chet Ramey  
wrote: 





On 9/30/21 7:24 PM, Mark March wrote:

> If execfail is set, a failed exec does not cause a non-interactive shell to 
> exit, but it seems to reset the EXIT trap:


Yes. When the shell runs `exec', it assumes the execed program will overlay
the shell process. To make that happen transparently, it has to undo things
it has done: it ends job control and restores the original process groups,
it restores the signal dispositions that it got from its parent, and it
clears other shell state like the EXIT trap.

If the exec fails, it tries to restore some things as well as it can, but
it doesn't try to restore any traps.


-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
        ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/





Re: EXIT trap is not executed after an exec failure in a non-interactive shell

2021-10-08 Thread Mark March
>I'll look at changing that

Thank you. I agree, making traps behave the same after an exec failure as after 
any other failed builtin is a better course of action. It will make the trap 
facility stronger. execve failures are not that uncommon, and are hard to 
predict in advance. The execve(2) man page lists 24 failure conditions! Since 
scripts can't predict exec failures, those that use traps and must survive a 
failed exec would have to save the traps before virtually every call to exec 
that takes a command name. This is quite a bit of work to put on bash users. On the 
other hand, not doing this will lead to subtle bugs where cleanup code will 
suddenly not run, or processes unexpectedly catch signals that have been 
previously blocked.
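For reference, the save-and-restore dance looks roughly like the sketch below. It captures the EXIT trap with `trap -p` before the exec and reinstalls it afterwards; the missing program path is just an illustration, and on bash versions that preserve traps the `eval` is a harmless no-op reinstall.

```shell
#!/usr/bin/env bash
# Sketch: preserve the EXIT trap across a possibly-failing exec on bash
# versions that clear traps when exec fails (execfail set).
shopt -s execfail
trap 'echo cleaning up...' EXIT

# trap -p prints the trap as a command that recreates it, e.g.
#   trap -- 'echo cleaning up...' EXIT
saved_exit_trap=$(trap -p EXIT)

exec /nonexistent-program-for-demo 2>/dev/null

# Only reached when exec failed; reinstall whatever exec may have cleared.
eval "$saved_exit_trap"
echo "exec failed in bash-$BASH_VERSION; EXIT trap restored"
```

This has to be repeated before every exec that takes a command name, which is exactly the burden described above.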

-Mark



On Friday, October 8, 2021, 08:02:31 AM PDT, Chet Ramey  
wrote: 





On 10/1/21 2:16 PM, Mark March wrote:
> Ok, thank you for clarifying. There is nothing in the documentation about 
> this behavior as far as I can tell. I would suggest adding a line about traps 
> getting reset after a failed exec to the paragraph on 'execfail'.

I think it will be a cleaner fix, and more intuitive, to make sure the
traps are preserved across a failed `exec'. I'll look at changing that
behavior.

Chet

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
        ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/




${y@a} on an empty array y fails with 'y: unbound variable' if run with set -u

2021-10-25 Thread Mark March
If -u is on and you declare a simple or associative array 'y' and set it empty, 
then ${y@a} will fail with 'unbound variable'.
I was able to repro this under 5.1.0(1)-release and 5.1.8. 5.0.17(1)-release 
does not seem to be affected.

The code to reproduce (output lines begin with #>):

echo $BASH_VERSION
#> 5.1.0(1)-release
set -u
declare -a y=()
echo ${y@a}
#> bash: y: unbound variable
declare -p y
#> declare -a y=()
echo ${y[@]}
#>
set +u
echo ${y@a}
#> a

As you can see, turning off -u makes ${y@a} work correctly. 
I wonder if this is a side-effect of the fix that you described in item (n) of 
the most recent change log:

n. Fixed a bug that caused ${foo@a} to treat foo as an unset variable if it was 
an array without a value for subscript 0/"0" but had other set elements

Thanks,
-Mark



Re: ${y@a} on an empty array y fails with 'y: unbound variable' if run with set -u

2021-10-26 Thread Mark March
>"An array variable is considered set if a subscript has been assigned  a 
>value."

My bad, I missed that. However, there is an argument to be made that ${foo@a} 
and ${foo@A} transforms should be exempt from the nounset option if foo is an 
unset array. An attempt to obtain the value (or something derived from the 
value, such as its length) of an unset variable is clearly a bug in one's code. 
But ${foo@a} and ${foo@A} expand to the _attributes_ of the variable, not its 
value. It's fine to consider an array unset if it has no indexes/keys, but even 
though it is unset such an array still has attributes. Bailing out on an 
attempt to access attributes (which clearly exist and can be obtained with 
declare -p at the cost of forking a subshell) just because the variable has no 
value seems counterintuitive and counter to the intended purpose of the nounset 
option, which is to help the programmer catch bugs in her code. 

>That was the result of a long discussion:
>
>https://lists.gnu.org/archive/html/bug-bash/2020-02/msg00050.html

It seems that in that thread you agreed to take steps towards making ${foo@a} 
succeed when declare -p foo would succeed. I would argue that this should also 
work under nounset, but I see how this may be a tough call since ${foo@a} is 
technically a parameter expansion, although of a new kind -- to a string 
derived from something other than the value of the parameter. I will just put 
[[ -o unset ]] && { set +u; ... set -u; } around my [[ ${foo@a} =~ A ]] for now.
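An alternative work-around that avoids toggling the option entirely is to parse `declare -p` output, at the cost of the command-substitution subshell mentioned above. A sketch (the function and variable names are mine, not from the thread):

```shell
# nounset-safe attribute probe: use declare -p instead of ${var@a},
# which can trip set -u on an empty array in bash 5.1.
set -u

is_assoc_array() {
  local decl
  decl=$(declare -p "$1" 2>/dev/null) || return 1
  [[ $decl == "declare -A"* ]]
}

declare -A empty_map=()
if is_assoc_array empty_map; then
  echo "empty_map has the associative-array attribute"
fi
```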

Thanks again for your explanation and context.
-Mark



On Tuesday, October 26, 2021, 07:02:59 AM PDT, Chet Ramey  
wrote: 





On 10/25/21 8:24 PM, Mark March wrote:
> If -u is on and you declare a simple or associative array 'y' and set it 
> empty, then ${y@a} will fail with 'unbound variable'.

It really is unset:

"An array variable is considered set if a subscript has been assigned  a
value."

> I was able to repro this under 5.1.0(1)-release and 5.1.8. 5.0.17(1)-release 
> does not seem to be affected.

Bash-5.1 fixed some bugs in this area.


> The code to reproduce (output lines begin with #>):
> 
> echo $BASH_VERSION
> #> 5.1.0(1)-release
> set -u
> declare -a y=()
> echo ${y@a}
> #> bash: y: unbound variable

Yes, because it's unset, but see below.

> declare -p y
> #> declare -a y=()
> echo ${y[@]}
> #>

The `@' and `*' subscripts are exempted from the nounset option to parallel
the behavior of `$@' and `$*' when there are no positional parameters (the
latter is specified by POSIX).
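The exemption is easy to observe; a quick check (my own example, not from the report):

```shell
# With nounset on, @-subscripted expansions of an unset array behave like
# "$@" with no positional parameters: they expand to nothing instead of
# triggering an "unbound variable" error.
set -u
declare -a y=()
echo "count=${#y[@]}"
for elem in "${y[@]}"; do    # loop body never runs: zero words
  echo "unexpected element: $elem"
done
echo "no unbound-variable error"
```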

> set +u
> echo ${y@a}
> #> a
> 
> As you can see, turning off -u makes ${y@a} work correctly. 

If the `nounset' option is not enabled, there is a special case for the
`a' transform so it will print the attributes of an unset array variable.

That was the result of a long discussion:

https://lists.gnu.org/archive/html/bug-bash/2020-02/msg00050.html



> I wonder if this is a side-effect of the fix that you described in item (n) 
> of the most recent change log:
> 
> n. Fixed a bug that caused ${foo@a} to treat foo as an unset variable if it 
> was an array without a value for subscript 0/"0" but had other set elements


This is not relevant, there are no set elements.

Chet
-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
        ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/




read -t0 may report that input is available where none is possible

2021-11-20 Thread Mark March
 true | { sleep 1 ; read -rt0 ; echo $?; }

outputs 0. There can clearly be no data on the stdin of the second process in 
the pipeline.

bash(1) has this to say about read -t:

"If timeout is 0, read  returns  immediately, without trying to read any data. 
The exit status is 0 if input is available on the specified file descriptor, 
non-zero otherwise."

Either bash considers EOF on the pipe "input", or this is a bug.

This was discussed 2 years ago 
https://lists.gnu.org/archive/html/bug-bash/2019-12/msg00076.html. Chet's 
conclusion then was that the detection of the presence of data on an fd is 
inherently racy. That is true, but still read -rt0 should probably not report 
"input is available" where there can be none. If the current behavior is 
intentional, consider making it clear in the man page. 

The implementation of input_avail() in Bash 5.x appears to just call select() 
on the fd, but select() also reports an fd as readable at EOF. The relevant 
quote from the select(2) man page:

"The file descriptors listed in readfds will be watched to see if characters 
become available for reading (more precisely, to see if a read will not block; 
in particular, a file descriptor is also ready on end-of-file)."
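That select() behavior is observable from the shell; a small demonstration (timings are arbitrary, and the first message depends on the bash version, since it is exactly the behavior under discussion):

```shell
# A pipe at EOF is "ready" in the select() sense, so read -t 0 may report
# input available even though an actual read will return nothing.
out=$(true | {
  sleep 0.2                       # let the writer exit: pipe is now at EOF
  read -r -t 0 && echo "read -t0 reports input available"
  IFS= read -r line || echo "real read finds EOF, no data"
})
printf '%s\n' "$out"
```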

-Mark



a nameref may resolve to different variables in assignment and parameter expansion

2022-01-06 Thread Mark March
It appears that the same initialized nameref variable may refer to variables in 
different scopes depending on the context where it is used. When used in an 
assignment, a nameref will look for the variable it references starting at the 
scope where the nameref itself was found, which may be below the current scope. 
In a parameter substitution a nameref will always start looking for its target 
in the current scope.

This leads to unintuitive behavior where a variable appears to not change its 
value after an assignment. Example:

x=global
foo() { declare -n xr=x ; bar; }
bar() { local x=bar ; xr=new ; echo "after xr=new xr is $xr"; }
foo
echo "x at top level is $x"

Output:

> after xr=new xr is bar
> x at top level is new

This is with bash-5.1.8 and 5.1.16.

It is completely unexpected that "xr=new; echo $xr" may print something other 
than "new". If this is intentional, I would be curious to know the rationale. 
It would also help a lot if you spelled out the nameref lookup rules in the man 
page.

On a side note, my personal preference would be to have namerefs remember the 
scope of the variable that they reference (once that scope is known), and 
always resolve to that same SHELL_VAR, until unset, instead of looking up a 
SHELL_VAR by name, as is the current behavior. Among other things, this would 
solve the dreaded conflict between namerefs and dynamic scoping that has been 
discussed extensively on this list, most recently in  
https://lists.gnu.org/archive/html/bug-bash/2020-08/msg00206.html. I use 
namerefs extensively in a fairly large Bash code base for parameter passing, 
and I have to use fairly elaborate work-arounds to detect local variables 
shadowing outer-scope variables that the function operates on via namerefs.
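One common defensive measure is a naming convention rather than a shell feature (my illustration, not necessarily the author's work-around): prefix a function's locals so callers cannot accidentally pass a name that the function itself declares.

```shell
# Defensive convention for dynamic scoping + namerefs: give a function's
# locals an unlikely prefix so they cannot shadow the caller's variable
# that the nameref is meant to reach.
x=global
set_via_ref() {
  local -n __svr_ref=$1    # prefixed local: callers won't pass "__svr_ref"
  __svr_ref=new
}
set_via_ref x
echo "x is now $x"
```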

-Mark



DEBUG trap in a background shell steals controlling terminal forcing parent shell to exit

2024-06-18 Thread Mark March
I am working with a large Bash code base where most scripts disable job control 
and the DEBUG trap is used extensively. I noticed that if I tried to run my 
scripts in the background, the interactive shell that started them would 
immediately exit on any keyboard input. A simple repro is to run

bash +m -c "/bin/echo ; trap 'trap DEBUG' DEBUG ; sleep 10" &
in an interactive shell with job control enabled. Hit Enter a few times. The 
shell that launched this background process exits. The background process 
itself appears to be killed by a signal. I was able to repro this under 
Bash-5.2.21 and 5.1.16 on Ubuntu 22.04.4, and with Bash-5.2.15 on macOS.

The problem seems to be triggered by the following code in run_debug_trap():
if (pipeline_pgrp > 0 && ((subshell_environment & (SUBSHELL_ASYNC|SUBSHELL_PIPE)) == 0))
    give_terminal_to (pipeline_pgrp, 1);

give_terminal_to() calls tcsetpgrp (shell_tty, pgrp), which places the calling 
process in the foreground without the parent shell's knowledge. Since the 
parent shell's process group is no longer in the foreground, I suspect it 
receives an EIO from a read(2) and exits, although I was not able to confirm 
this with strace.
In my repro pipeline_pgrp is non-zero at the time of DEBUG trap execution. gdb 
shows that it was set to shell_pgrp in make_child() that forked /bin/echo 
without job control (+m). The other condition of the if statement is also 
satisfied.

-Mark


Re: DEBUG trap in a background shell steals controlling terminal forcing parent shell to exit

2024-06-25 Thread Mark March
Bump in case this fell through the cracks. My script uses no job control 
facilities (in fact, it turns job control off), yet when run in the background 
it would log the user out on any keyboard input. This can't be right. Can you 
at least confirm that this is indeed a bug in Bash? Are there any work-arounds 
other than not using the DEBUG trap or leaving job control on?

-Mark



Re: DEBUG trap in a background shell steals controlling terminal forcing parent shell to exit

2024-07-01 Thread Mark March
Thank you for the quick fix. Bash 5.2.21 with the patch applied no longer 
exhibits the problem in my tests.
-Mark

On Thursday, June 27, 2024 at 06:05:28 AM PDT, Chet Ramey 
 wrote:  
 
 On 6/18/24 4:55 PM, Mark March wrote:
> I am working with a large Bash code base where most scripts disable job 
> control and the DEBUG trap is used extensively. I noticed that if I tried to 
> run my scripts in the background, the interactive shell that started them 
> would immediately exit on any keyboard input. A simple repro is to run
> 
> bash +m -c "/bin/echo ; trap 'trap DEBUG' DEBUG ; sleep 10" &
> in an interactive shell with job control enabled. Hit Enter a few times. The 
> shell that launched this background process exits. The background process 
> itself appears to be killed by a signal.

Thanks for the report. The attached patch should fix your issue, though
there might be some corner cases. Let me know.

Chet
-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
        ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/

  


Re: waiting for process substitutions

2024-07-02 Thread Mark March via Bug reports for the GNU Bourne Again SHell
lastpipe takes effect only if job control is off. From the man page:

    lastpipe
          If set, and job control is not active, the shell runs the last
          command of a pipeline not executed in the background in the
          current shell environment.

With lastpipe on and job control off, 'wait' appears to wait for all process 
substitutions:

set +m
shopt -s lastpipe
echo $'foo\nbar' | tee >(echo first ; exit 1) >(wc ; sleep 10 ; echo wc) >(tail -n 1; echo tail)
wait

blocks for 10s under Bash 5.2.21.

Likely just a documentation bug.
-Mark

On Saturday, June 29, 2024 at 07:52:34 PM PDT, Zachary Santer 
 wrote:  
 
 On Sat, Jun 29, 2024 at 2:07 PM Oğuz  wrote:
>
> There is a limit to the number of jobs Bash can remember, once it's exceeded 
> the oldest job is overwritten. Do we really want process substitutions to 
> count against that limit?

They might already. Now I'm wondering if the documentation just needed updating.

I'm afraid to report this as a bug, because it feels like something
that running bash in MSYS2 on Windows could be responsible for, but
here goes.

Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: msys
Compiler: gcc
Compilation CFLAGS: -march=nocona -msahf -mtune=generic -O2 -pipe
-D_STATIC_BUILD
uname output: MINGW64_NT-10.0-19045 Zack2021HPPavilion 3.5.3.x86_64
2024-06-03 06:22 UTC x86_64 Msys
Machine Type: x86_64-pc-msys

Bash Version: 5.2
Patch Level: 26
Release Status: release

Description:

So bash can wait on process substitutions.

1) When all child processes are process substitutions:
a. wait without arguments actually appears to wait for all of them,
not just the last-executed one, contradicting the man page.
b. A subsequent call to wait listing all child process pids
immediately terminates successfully.
c. If calling wait -n in the middle of all this, whether listing only
un-waited-on child process pids or all child process pids, it lists
all argument pids as "no such job" and terminates with code 127. This
is probably incorrect behavior.

2) When a standard background process is added:
a. wait without arguments waits for all child processes.
b. A subsequent call to wait listing all child process pids lists all
argument pids as not children of the shell and terminates with code
127. This seems incorrect, or at least the change in behavior from 1b.
is unexpected.
c. If calling wait -n in the middle of all this, we see that it only
lists the pids from process substitutions as "no such job".

Repeat-By:

./procsub-wait false false
./procsub-wait false true
./procsub-wait true false
./procsub-wait true true

procsub-wait and the results of running it on the listed system are attached.