On 9/20/15 3:45 PM, Stephane Chazelas wrote:
> 2015-09-20 17:12:45 +0100, Stephane Chazelas:
> [...]
>> I thought the termsig_handler was being invoked upon SIGINT as
>> the SIGINT handler, but it is being called explicitly by
>> set_job_status_and_cleanup so the problem is elsewhere.
>>
>> child_caught_sigint is 0 while if I understand correctly it
>> should be 1 for a cmd that calls exit() upon SIGINT. So that's
>> probably where we should be looking.
> [...]
>
> I had another look.
>
> If we're to believe gdb, child_caught_sigint is 0 because
> waitpid() returns without EINTR even though wait_sigint_received
> is 1.
>
> The only reasonable explanation I can think of is that the child
> handles its SIGINT first, exits which updates its state and
> causes bash the parent to be scheduled, and waitpid() returns
> (without EINTR) and after that bash's SIGINT handler kicks in too
> late.

Absent kernel problems, there are four scenarios for the child process
reacting to SIGINT:

1. The SIGINT arrives before the child begins executing.
2. The SIGINT arrives while the child is executing.
3. The SIGINT arrives while the child is exiting successfully.
4. The SIGINT arrives after the child has exited but before the parent's
   waitpid() returns.

In the first two cases, the shell's waitpid() should return -1, but the
first case will probably fail with ECHILD while the second fails with
EINTR. In the third case, there's not really anything the shell can do,
since there's nothing to distinguish that case from one where the child
catches SIGINT and exits successfully, and your patch doesn't change
things. The fourth case will, in practice, be indistinguishable from the
third case, since the kernel is usually `greedy' and will not return EINTR
if there is something to report.

In all these cases, I assume that bash has called waitchld() and
waiting_for_child == 1. If it's not, the signal handler treats the signal
as it normally would when the shell is not waiting for a child to exit.

> Anyway, this patch makes the problem go away for me (and
> addresses my problem #2 about exit code 130 not being treated
> as an interrupted child). It might break things though if there
> was a real reason for bash to check for waitpid()'s EINTR.

You should read
http://lists.gnu.org/archive/html/bug-bash/2011-02/msg00088.html
for a summary of why the test for waitpid() returning -1/EINTR exists.
Linus's posts, at least the ones where there's more light than heat, are
good reading.

> With that patch applied,
>
> ./bash -c 'sh -c "trap exit INT; sleep 120; :"; echo hi'
> ./bash -c 'mksh -c "sleep 120; :"; echo hi'
>
> Does *not* output "hi" (as mksh or sh do an exit(130) which is
> regarded as them being "interrupted by that SIGINT", or at least
> reporting that the child they want to report the status of
> (sleep) has been killed by a SIGINT).

This still counts as catching and handling the SIGINT, and the shell
should not act as if the foreground process died as a result of one.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU    c...@case.edu    http://cnswww.cns.cwru.edu/~chet/