Re: AIX and Interix also do early PID recycling.

Michael Haubenwallner Tue, 24 Jul 2012 09:55:00 -0700

On 07/24/2012 05:49 PM, Greg Wooledge wrote:
> On Tue, Jul 24, 2012 at 05:03:36PM +0200, michael.haubenwall...@salomon.at wrote:
>> Description:
>>      On AIX (5.3, 6.1, 7.1), as well as on Interix (any version) I do encounter
>>      some race condition in a code similar to:
>>          if grep "unwanted" /some/nonexistent/filename
>>          then
>>            echo "bad"
>>            exit 1
>>          fi
>>          echo "good"


Nothing in this script snippet is multiprocessing or asynchronous; there isn't
even a pipe or a subshell anywhere. Copied and pasted, the actual code is:

    # Verify that the libtool files don't contain bogus $D entries.
    local abort=no gentoo_bug=no
    for a in "${ED}"usr/lib*/*.la ; do
        s=${a##*/}
        if grep -qs "${D}" "${a}" ; then
                vecho -ne '\a\n'
                eqawarn "QA Notice: ${s} appears to contain PORTAGE_TMPDIR paths"
                abort="yes"
        fi
    done
    [[ ${abort} == "yes" ]] && die "soiled libtool library files found"

When it erroneously fails, the message is "QA Notice: *.la appears to contain ...",
even though there is no file named '*.la'.

Agreed, one bug here is that nullglob should be set so that grep is not run at
all when there is no *.la file, but that would just hide the bash bug...
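For illustration, here is a minimal sketch of what nullglob changes for such a
loop. The scratch directory is a hypothetical stand-in for "${ED}"usr/lib*, and
the counters exist only for the demonstration:

```bash
#!/usr/bin/env bash
# Hypothetical scratch directory containing no *.la files.
dir=$(mktemp -d)

# Default behaviour: an unmatched pattern is left as a literal string,
# so the loop body runs once with a filename that does not exist.
shopt -u nullglob
without=0
for a in "$dir"/*.la; do without=$((without + 1)); done

# With nullglob: an unmatched pattern expands to nothing,
# so the loop body (and thus grep) never runs.
shopt -s nullglob
with=0
for a in "$dir"/*.la; do with=$((with + 1)); done
shopt -u nullglob

echo "iterations without nullglob: $without"
echo "iterations with nullglob: $with"
rmdir "$dir"
```

Without nullglob the loop runs once on the literal pattern, which is where the
'*.la' in the QA notice comes from.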

>>      Sometimes it does "bad" while it should do "good" always.
> 
> If that happens, then I don't see how it is related to recycling PIDs.
> In fact, if grep is failing to produce the correct result, it is a
> problem with your OS's implementation of grep, and not with bash.

Adding some debug-printfs to bash itself around fork, execve, waitpid shows:

Bash usually does fork()+execve("grep"), as well as waitpid(-1, ...).

Whenever waitpid() returns the PID of this "grep", the reported exit status
is always 2, even when bash goes "bad"...

Adding more debug-printfs to bash's wait_for() and execute_command_internal() shows:

Usually, execute_command_internal() does wait_for(this grep's PID) before
executing anything else, correctly evaluates the return value as "not true",
and skips the "bad" part.

But when some previous, unrelated command got the same PID from fork() as this
"grep" does, execute_command_internal() does /not/ call wait_for() at all,
because last_made_pid is equal to last_pid. The path to "bad" is taken instead,
because some other exit status is evaluated in place of this grep's one.

However, in a subsequent wait_for(another child), waitpid() does report exit
status 2 for this grep's PID, but by then bash has already gone "bad" and
ignores that exit status.
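The effect can be mimicked with a toy model in shell. This is only a sketch of
the decision described above, not bash's actual source: the variables borrow
the names last_made_pid and last_pid, while run_command and exec_result are
hypothetical names invented for the illustration, and fork() and wait_for()
are simulated by plain assignments.

```bash
#!/usr/bin/env bash
# Toy model only: these variables mimic the roles described in the text;
# this is not bash's actual source code.
last_made_pid=0    # PID returned by the most recent (simulated) fork()
exec_result=0      # exit status the shell believes the last command had

# run_command PID STATUS   (hypothetical helper)
#   PID    = the PID the simulated fork() hands out
#   STATUS = the exit status the child would really report
run_command() {
    local last_pid=$last_made_pid   # PID of the previous child, saved first
    last_made_pid=$1                # simulated fork()
    if (( last_made_pid != last_pid )); then
        exec_result=$2              # simulated wait_for(): real status collected
    fi                              # otherwise the wait is skipped, stale status kept
}

# Normal case: grep gets a fresh PID, so its exit status 2 is seen.
run_command 100 0      # some earlier, unrelated command that succeeded
run_command 101 2      # grep on a nonexistent file
normal=$exec_result

# Early recycling: grep gets the same PID as the previous child.
last_made_pid=0; exec_result=0
run_command 100 0
run_command 100 2
recycled=$exec_result

echo "fresh PID: status $normal; recycled PID: status $recycled"
```

In the recycled case the model keeps the stale status 0 of the earlier command,
which an "if" treats as success, so the "bad" branch is taken.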

>>                [[ ${#last} > 4 ]] && used=used_${last::((${#last}-4))} || used=used_0
> 
> That [[ ${#last} > 4 ]] check is incorrect.  You're doing a string
> comparison there; [[ 10 > 4 ]] is false.  Either use ((...)) or use -gt.

Indeed! (But irrelevant here; it is just a performance optimisation.)
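For the record, a small sketch of the difference between the three forms; the
variable names are made up for the demonstration:

```bash
#!/usr/bin/env bash
n=10

# Inside [[ ]], > compares strings lexicographically: "10" sorts before "4".
if [[ $n > 4 ]]; then string_cmp=true; else string_cmp=false; fi

# -gt inside [[ ]] compares the operands as integers.
if [[ $n -gt 4 ]]; then int_cmp=true; else int_cmp=false; fi

# (( )) evaluates an arithmetic expression.
if (( n > 4 )); then arith_cmp=true; else arith_cmp=false; fi

echo "string: $string_cmp, -gt: $int_cmp, arithmetic: $arith_cmp"
```

Only the string comparison comes out false for 10 versus 4.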

> In any case, if your script breaks because PIDs are recycled sooner than
> you expect, then it is a bug in your script, and not in bash.

Neither I nor my script expects anything about PIDs at all here.

> (What
> would you expect bash to do about it in the first place?)  It may also
> interest you to know that there are some operating systems that use
> random PID allocation, instead of sequential (OpenBSD for example).

PID randomisation isn't a problem at all, as long as a previously used PID
is not reused too early.

> http://mywiki.wooledge.org/ProcessManagement has some tips on how to
> deal with multiple processes.

Interesting page, but there's nothing that applies here.

Thank you anyway!
/haubi/
