On 07/26/2012 11:37 PM, Michael Haubenwallner wrote:
> On 07/26/12 20:29, Chet Ramey wrote:
>> OK, we have some data, we have a hypothesis, and we have a way to test it.
>> Let's test it.
>>
>> Michael, please apply the attached patch, disable RECYCLES_PIDS, and run
>> your tests again. This makes the check for previously-saved exit statuses
>> unconditional.
>>
>> Let's see if this is the one change of significance.
>
> Nope, doesn't fix the problem, even if it might be necessary though
> to not mix up stored exitstates.

Regarding mixing up stored exit states: this patch alone isn't enough to
make the testcase below work reliably. It is also necessary to drop the
pid_wrap detection, as pids aren't guaranteed to be (re)used in any
particular order. However, this depends heavily on the machine's load.

With the attached patch I haven't been able to break the testcase below so
far on the AIX 6.1 box here. But the other testcase, the one using
$() children, still fails.

---
for job in {128..511} {0..127}
do
  if [[ ${job} -lt 128 ]]; then
    ( exit 17 ) &
  else
    ( exit 1 ) &
  fi
  eval "pidof_${job}=\$!"
done

for job in {127..0}; do
  pid=pidof_${job}
  pid=${!pid}
  wait ${pid}
  ret=$?
  if [ ${ret} -ne 17 ]; then
    echo "job ${job} failed with ret ${ret}"
  fi
done
---

Thank you!
/haubi/
(away for next 3 weeks)

*** jobs.c.orig	2012-07-27 15:29:54.283862562 +0200
--- jobs.c	2012-07-27 15:29:51.960238374 +0200
***************
*** 1897,1903 ****
--- 1897,1906 ----
  #endif

+ #if 0
        if (pid_wrap > 0)
+ #endif
          delete_old_job (pid);

+ #if 0
  #if !defined (RECYCLES_PIDS)
        /* Only check for saved status if we've saved more than CHILD_MAX
***************
*** 1905,1908 ****
--- 1908,1912 ----
        if ((js.c_reaped + bgpids.npid) >= js.c_childmax)
  #endif
+ #endif
          bgp_delete (pid);	/* new process, discard any saved status */
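
To illustrate the point above about pids not being reused in any particular order, here is a rough toy model in plain sh. It is only a sketch and not bash's jobs.c: the names first_pid, wrapped, note_pid, spawn, wait_model and run are all made up for illustration, and the wrap heuristic (a wrap is noticed only once a new pid is numerically lower than the first pid seen) is a simplifying assumption. It compares discarding a saved exit status only after a detected wrap ("gated") with discarding it for every new child ("always"), which is roughly what the patch above switches to.

```shell
#!/bin/sh
# Toy model only, NOT bash's jobs.c. All names here are hypothetical.
# saved_<pid> is the exit status remembered for an already-reaped child,
# real_<pid> is the status the current child with that pid exits with.

# Assumed, simplified wrap heuristic: a wrap is only noticed once a new
# pid is numerically lower than the first pid ever seen.
note_pid() {
  if [ -z "$first_pid" ]; then
    first_pid=$1
  elif [ "$1" -lt "$first_pid" ]; then
    wrapped=yes
  fi
}

# spawn POLICY PID STATUS: a new child gets PID and will exit with STATUS.
# "gated" forgets an old saved status only after a detected wrap,
# "always" forgets it for every new child.
spawn() {
  note_pid "$2"
  if [ "$1" = always ] || [ "$wrapped" = yes ]; then
    unset "saved_$2"
  fi
  eval "real_$2=\$3"
}

# wait_model PID: sets ret to a surviving saved status if there is one,
# otherwise to the real status, and remembers the result for later waits.
wait_model() {
  eval "ret=\${saved_$1-}"
  if [ -z "$ret" ]; then
    eval "ret=\${real_$1}"
  fi
  eval "saved_$1=\$ret"
}

# run POLICY: pid 500 is handed out twice in a row, with no visible wrap.
run() {
  first_pid=
  wrapped=no
  unset saved_500 real_500
  spawn "$1" 500 1;  wait_model 500   # first child exits 1, status is saved
  spawn "$1" 500 17; wait_model 500   # pid reused at once, child exits 17
  echo "$1: $ret"
}

run gated    # prints "gated: 1"   (stale status of the earlier child)
run always   # prints "always: 17" (status of the child actually waited for)
```

In this model the gated variant hands back the stale status because the reused pid never looks like a wrap, which is the kind of mix-up that dropping the pid_wrap check is meant to avoid. It says nothing about the remaining failure with the $() children.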