wait fails to grab exit codes with multiple process substitutions

leo . dalecki Sat, 14 Sep 2019 13:52:50 -0700

From: leo.dale...@ntymail.com
To: bug-bash@gnu.org
Subject: wait fails to grab exit codes with multiple process substitutions


Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -g -O2 -fdebug-prefix-map=/build/bash-2bxm7h/bash-5.0=. -fstack-protector-strong -Wformat -Werror=format-security -Wall -Wno-parentheses -Wno-format-security
uname output: Linux debian 4.19.0-5-amd64 #1 SMP Debian 4.19.37-5+deb10u2 (2019-08-08) x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu

Bash Version: 5.0
Patch Level: 3
Release Status: release

Description:
        Hi,

        I was trying to write a script that runs an arbitrary number of
        asynchronous commands and, as soon as one finishes, acts on its exit
        code and output.

        I attempted to do this with process substitution and the wait
        builtin, but wait's ability to retrieve the exit code of a finished
        child of the current shell seems inconsistent.

        The script starts ten asynchronous commands, each by using exec to
        redirect a process substitution's output to a new file descriptor in
        the current shell. It then calls wait with the process IDs of the
        asynchronous tasks to retrieve their exit codes.
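
        Reduced to a single task, the mechanism can be sketched as follows.
        This is only an illustration of the redirection step, not the full
        reproduction; the variable names fd, pid and line are chosen here
        for the example.

```bash
#!/bin/bash
# Minimal sketch of the mechanism: one process substitution on a fresh FD.

# Bash picks a free descriptor (10 or higher) and stores its number in fd.
exec {fd}< <(echo hello)
pid=$!                  # PID of the process substitution, as bg.sh assumes

read -r line <&"$fd"    # Read the task's output through the descriptor
exec {fd}<&-            # Close the descriptor when done

echo "fd=$fd pid=$pid output=$line"
```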

        wait fails on nine of the ten tasks with the error "wait: pid X is
        not a child of this shell". Oddly, it succeeds in retrieving the exit
        code of the last asynchronous task started.

        The exact script I used is included in the Repeat-By section as a way
        to trigger the bug.

        Thank you and have a nice day.

Repeat-By:
        The following script reproduces the behavior described in the
        Description section:

        
        #!/bin/bash
        
        # bg.sh
        
        # Execute commands asynchronously and retrieve their exit codes and
        # outputs upon completion.
        
        asynch_cmds=
        
        echo -e "Asynchronous commands:\nPID    FD"
        
        for i in {1..10}; do
                # Dummy asynchronous task; its standard output is redirected
                # to the current shell
                exec {fd}< <(sleep $(( i * 2 )) && echo $RANDOM && exit $i)
                # Append the task's PID and FD to the list of running tasks
                asynch_cmds+="$!:$fd "
        
                echo "$!        $fd"
        done
        
        echo -e "\nExit codes and outputs:\nPID FD      EXIT    OUTPUT"
        
        # While the list of running tasks isn't empty
        while [[ ${#asynch_cmds} -gt 0 ]]; do
        
                for asynch_cmd in $asynch_cmds; do # For each task in the list
        
                        pid=${asynch_cmd%:*} # Task's PID
                        fd=${asynch_cmd#*:} # Task's FD
        
                        if ! kill -0 $pid 2>/dev/null; then # If the task ended
        
                                wait $pid # Retrieving the task's exit code
                                echo -n "$pid   $fd     $?      "
        
                                cat <&$fd # Retrieving the task's output
        
                                # Remove the task from the list
                                asynch_cmds=${asynch_cmds/$asynch_cmd /}
                        fi
                done
        done


        Running this script shows wait giving the error described above for
        the first nine tasks, while the tenth task (exit code 10) does not
        trigger it.
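
        For comparison, here is a sketch of the same pattern using ordinary
        background jobs instead of process substitutions, where wait is
        expected to return the exit code for every recorded PID. It assumes
        that temporary files are an acceptable substitute for reading from
        a file descriptor; tmpdir, pids and results are names made up for
        this example, and only three tasks are started to keep it short.

```bash
#!/bin/bash
# Sketch: background jobs plus temp files in place of process substitution.

tmpdir=$(mktemp -d)                 # Hypothetical scratch directory for output
trap 'rm -rf "$tmpdir"' EXIT

pids=
for i in 1 2 3; do
        # Each task is a real background job, so $! names a child of this shell.
        ( sleep 1; echo "out$i"; exit "$i" ) > "$tmpdir/$i.out" &
        pids+="$!:$i "              # Record PID and task number
done

results=
for entry in $pids; do
        pid=${entry%:*}             # Task's PID
        i=${entry#*:}               # Task's number
        wait "$pid"                 # Returns the job's exit code
        code=$?
        results+="$i:$code:$(cat "$tmpdir/$i.out") "
done

echo "$results"
```

        Note that this waits for the tasks in launch order rather than in
        completion order; on bash 4.3 or later, wait -n may be an option
        for reacting to whichever job finishes first.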
