Re: SIGINT handling
On 24/09/15 07:20, Stephane Chazelas wrote: > 2015-09-24 07:01:23 +0100, Stephane Chazelas: >> 2015-09-23 21:27:00 -0400, Chet Ramey: >>> On 9/19/15 5:31 PM, Stephane Chazelas wrote: >>> In case it was caused by some Debian patch, I recompiled the code of 4.3.42 from gnu.org and the one from the devel branch on the git repository (commit bash-20150911 snapshot) and still: $ ./bash -c 'sh -c "trap exit INT; sleep 10; :"; echo hi' ^Chi $ ./bash -c 'sh -c "trap exit INT; sleep 10; :"; echo hi' ^Chi $ ./bash -c 'sh -c "trap exit INT; sleep 10; :"; echo hi' ^C $ ./bash -c 'sh -c "trap exit INT; sleep 10; :"; echo hi' ^Chi Sometimes (and the frequency of occurrences is erratic, generally roughly 80% of "hi"s but at times, I don't see a "hi" in a while), the "hi" doesn't show up. Note that I press ^C well after sleep has started. >>> >>> It would be nice to see a system call trace for this so we can check >>> what's going on with the timing. >> >> I don't have them logged but I did several tests in gdb >> with "handle SIGINT nostop pass" and as I said before, >> Upon the test that sets child_caught_sigint, waitpid() has not >> returned with EINTR and wait_sigint_received has been set. >> >> If I break on the SIGINT handler, I see the call trace at the >> return of the "syscall". >> >> I can try and get you a call trace later today. > [...] > > (gdb) handle SIGINT nostop pass > SIGINT is used by the debugger. > Are you sure you want to change it? (y or n) y > SignalStop Print Pass to program Description > SIGINTNoYes Yes Interrupt > (gdb) break wait_sigint_handler > Breakpoint 1 at 0x443a70: file jobs.c, line 2241. > (gdb) run > Starting program: bash-4.3/bash -c ./a\;\ echo\ x > ^C > Program received signal SIGINT, Interrupt. > > Breakpoint 1, wait_sigint_handler (sig=2) at jobs.c:2241 > 2241{ > (gdb) bt > #0 wait_sigint_handler (sig=2) at jobs.c:2241 > #1 > #2 0x776bc31c in __libc_waitpid (pid=pid@entry=-1, > stat_loc=stat_loc@entry=0x7fffdbc8, options=options@entry=0) at > ../sysdeps/unix/sysv/linux/waitpid.c:31 > #3 0x00445f3d in waitchld (block=block@entry=1, wpid=5337) at > jobs.c:3224 > #4 0x0044733b in wait_for (pid=5337) at jobs.c:2485 > #5 0x00437992 in execute_command_internal > (command=command@entry=0x70bb88, asynchronous=asynchronous@entry=0, > pipe_in=pipe_in@entry=-1, pipe_out=pipe_out@entry=-1, > fds_to_close=fds_to_close@entry=0x70bde8) at execute_cmd.c:829 > #6 0x00437b0e in execute_command (command=0x70bb88) at > execute_cmd.c:390 > #7 0x00435f23 in execute_connection (fds_to_close=0x70bdc8, > pipe_out=-1, pipe_in=-1, asynchronous=0, command=0x70bd88) at > execute_cmd.c:2494 > #8 execute_command_internal (command=0x70bd88, > asynchronous=asynchronous@entry=0, pipe_in=pipe_in@entry=-1, > pipe_out=pipe_out@entry=-1, fds_to_close=fds_to_close@entry=0x70bdc8) > at execute_cmd.c:945 > #9 0x0047955b in parse_and_execute (string=, > from_file=from_file@entry=0x4b5f96 "-c", flags=flags@entry=4) at > evalstring.c:387 > #10 0x004205d7 in run_one_command (command=) at > shell.c:1348 > #11 0x0041f524 in main (argc=3, argv=0x7fffe258, > env=0x7fffe278) at shell.c:695 > (gdb) frame 2 > #2 0x776bc31c in __libc_waitpid (pid=pid@entry=-1, > stat_loc=stat_loc@entry=0x7fffdbc8, options=options@entry=0) at > ../sysdeps/unix/sysv/linux/waitpid.c:31 > 31 ../sysdeps/unix/sysv/linux/waitpid.c: No such file or directory. > (gdb) disassemble > Dump of assembler code for function __libc_waitpid: >0x776bc300 <+0>: mov0x2f14cd(%rip),%r9d# > 0x779ad7d4 <__libc_multiple_threads> >0x776bc307 <+7>: test %r9d,%r9d >0x776bc30a <+10>:jne0x776bc336 <__libc_waitpid+54> >0x776bc30c <+12>:xor%r10d,%r10d >0x776bc30f <+15>:movslq %edx,%rdx >0x776bc312 <+18>:movslq %edi,%rdi >0x776bc315 <+21>:mov$0x3d,%eax >0x776bc31a <+26>:syscall > => 0x776bc31c <+28>:cmp$0xf000,%rax >0x776bc322 <+34>:ja 0x776bc325 <__libc_waitpid+37> >0x776bc324 <+36>:retq >0x776bc325 <+37>:mov0x2ebb3c(%rip),%rdx# > 0x779a7e68 >0x776bc32c <+44>:neg%eax >0x776bc32e <+46>:mov%eax,%fs:(%rdx) >0x776bc331 <+49>:or $0x,%rax > (gdb) fin > Run till exit from #2 0x776bc31c in __libc_waitpid > (pid=pid@entry=-1, stat_loc=stat_loc@entry=0x7fffdbc8, > options=options@entry=0) at ../sysdeps/unix/sysv/linux/waitpid.c:31 > 0x00445f3d in waitchld (block=block@entry=1, wpid=5481) at jobs.c:3224 > 3224 pid = WAITPID (-1, &status, waitpid_flags); > V
Re: SIGINT handling
2015-09-24 09:36:08 +0100, Pádraig Brady: [...] > > (gdb) handle SIGINT nostop pass [...] > > In case it's relevant, I'm not entirely sure of gdb's signal handling: > https://sourceware.org/bugzilla/show_bug.cgi?id=18364 Yes, I wondered about that. I'd expect the "handle SIGINT nostop pass", to take gdb out of the loop, but I've not verified it and I suspect ptracing could have side effects. It's easy to corroborate with printfs though here which I just did: $ ./bash -c './a; echo x' ^Cwait_sigint_received=1 pid=-1 wait_sigint_received=1 pid=956 x $ ./bash -c './a; echo x' ^Cwait_sigint_received=1 pid=958 $ diff -pu jobs.c\~ jobs.c --- jobs.c~ 2015-09-20 20:03:14.692119372 +0100 +++ jobs.c 2015-09-24 11:49:03.963122465 +0100 @@ -3262,6 +3262,7 @@ itrace("waitchld: waitpid returns %d blo require the child to actually die due to SIGINT to act on the SIGINT we received; otherwise we assume the child handled it and let it go. */ + fprintf(stderr, "wait_sigint_received=%d pid=%d\n", wait_sigint_received, pid); if (pid < 0 && errno == EINTR && wait_sigint_received) child_caught_sigint = 1; -- Stephane
Re: SIGINT handling
Given that the bug was introduced by Linus' patch (to fix a bug that anyway is in all shell implementations that do WCE) and that it's caused by a behaviour that seems to be specific to the Linux kernel (that the kernel seems to be messing up with the order of delivery of the SIGCHLD (or return from waitpid()) and SIGINT), we may want to bring the issue up to him. Here, the behaviour could be seen as a kernel bug, since the child should clearly die *after* the SIGINT has been issued to the parent (since the ^C should insert the SIGINT in the signal queue of both parent and child processes at the same time) so it's wrong for SIGINT to be handled *after* waitpid() returns. But of course one can also argue that the order of signal delivery is not guaranteed in general anyway. IMO, the best approach would be to give up on WCE altogether which is more source of frustration anyway than it has ever helped. I live very well with a /bin/sh (dash) and interactive shell (zsh) that don't do it. WCE may be good in a perfect world where everything does it (everything that calls waitpid() without using system(3)), but if not, I hardly see the point. What's the point of bash doing it when sh, find -exec, xargs, watch, git (like in that emacs bug report) don't do it. it seems to me that finding another way to address it (like emacs approach of putting itself on its own in a new forground job if it's not already a process group leader) for the rare cases where it's useful (like the vi -> :! case) would be better. -- Stephane
Re: SIGINT handling
On 9/22/15 8:18 AM, Greg Wooledge wrote: > On Mon, Sep 21, 2015 at 10:07:55PM +0100, Stephane Chazelas wrote: >> Maybe the test scenario was not clear: >> >> bash -c 'cmd; echo hi' >> >> is run from an interactive shell, cmd is a long running >> application (the problem that sparked this discussion was with >> ping and I showed examples with an inline-script calling sleep) > > Just for the record, ping is the *classic* example of an incorrectly > written application that traps SIGINT but doesn't kill itself with > SIGINT afterward. (This seems to be true on multiple systems -- at > the very least, HP-UX and Linux pings both suffer from it.) > > A loop like this works as expected: > > while true; do > sleep 1 > done > > A loop like this does not: > > while true; do > ping -c 1 some.host # or on HP-UX, ping some.host -n 1 > done If you decide, as bash has, to allow the foreground job to determine what to do with SIGINT, you have to cope with software like ping. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, ITS, CWRUc...@case.eduhttp://cnswww.cns.cwru.edu/~chet/
Re: SIGINT handling
On 9/24/15 9:57 AM, Stephane Chazelas wrote: > IMO, the best approach would be to give up on WCE altogether > which is more source of frustration anyway than it has ever > helped. I live very well with a /bin/sh (dash) and interactive > shell (zsh) that don't do it. We'll agree to disagree. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, ITS, CWRUc...@case.eduhttp://cnswww.cns.cwru.edu/~chet/
Re: SIGINT handling
2015-09-24 14:53:16 -0400, Chet Ramey: > On 9/24/15 9:57 AM, Stephane Chazelas wrote: > > > IMO, the best approach would be to give up on WCE altogether > > which is more source of frustration anyway than it has ever > > helped. I live very well with a /bin/sh (dash) and interactive > > shell (zsh) that don't do it. > > We'll agree to disagree. [...] Now that we're settled on WCE, would you agree that a=$(cmd-that-catches-sigint) should behave like (cmd-that-catches-sigint) (as in, not exit the shell as per WCE)? What about $PIPESTATUS? In: cmd-that-catches-sigint | cmd-that-does-not or cmd-that-does-not | cmd-that-catches-sigint Should we exit on SIGINT or leave that command run in background? Should pipefail have an influence on the behaviour? What about lastpipe? What about when using the wait builtin? Why should: cmd & wait "$!" be treated differently from cmd ? Because cmd's stdin is /dev/null and so is unlikely to be an interactive command? So we admit WCE is a kludge -- Stephane
Re: SIGINT handling
On 9/21/15 5:07 PM, Stephane Chazelas wrote: > The problem is that here the parent's SIGINT handler is run upon > the return from waitpid(), just after. My patch doesn't rely on > EINTR from waitpid() (which doesn't happen here, waitpid() returns > with the pid of the child that did an exit() upon receiving > SIGINT), just on the "status" returned by the child, so doesn't > have the problem. I wonder if the kernel is restarting the waitpid() even though the signal handler was installed without SA_RESTART. > What do you suggest we do to fix that issue? I think your additional test for wait_sigint_received coupled with a check that the child died for some other reason than SIGINT, in addition to the EINTR test, is a reasonable fix. >> This still counts as catching and handling the SIGINT, and the shell >> should not act as if the foreground process died as a result of one. > > That's the point I'm arguing on. > > If the command handled SIGINT and returned with 130, I argue it > is considering itself and telling its parent as having been > "interrupted" No, it's not. If a shell exits with status 130, it's saying that the last command it executed was killed by SIGINT, or happened to exit with status 130 for some random reason. It's very possible for a non-interactive shell to restore its original SIGINT handler and resend SIGINT to itself, if it's concerned about telling its parent that it's been interrupted. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, ITS, CWRUc...@case.eduhttp://cnswww.cns.cwru.edu/~chet/
[PATCH] input: fix logic bug that breaks on O_TEXT systems
The bash-20101229 snapshot introduced an attempt to incorporate some patches that were previously used downstream by Cygwin with regards to managing scripts read with O_TEXT mode (where lseek() sees different offsets than the number of bytes read, because the system is converting physical '\r\n' into logical '\n'). But it introduced a typo, using O_TEXT where it meant to use B_TEXT, with the result that scripts read in text mode would seek to the wrong location. Thanks to Jeff Downs for helping me find the typo. Signed-off-by: Eric Blake --- input.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/input.c b/input.c index 2a9ca26..308b87e 100644 --- a/input.c +++ b/input.c @@ -205,7 +205,7 @@ make_buffered_stream (fd, buffer, bufsize) if (bufsize == 1) bp->b_flag |= B_UNBUFF; if (O_TEXT && (fcntl (fd, F_GETFL) & O_TEXT) != 0) -bp->b_flag |= O_TEXT; +bp->b_flag |= B_TEXT; return (bp); } -- 2.4.3
Re: [PATCH] input: fix logic bug that breaks on O_TEXT systems
On 9/24/15 5:47 PM, Eric Blake wrote: > The bash-20101229 snapshot introduced an attempt to incorporate > some patches that were previously used downstream by Cygwin > with regards to managing scripts read with O_TEXT mode (where > lseek() sees different offsets than the number of bytes read, > because the system is converting physical '\r\n' into logical > '\n'). But it introduced a typo, using O_TEXT where it meant > to use B_TEXT, with the result that scripts read in text mode > would seek to the wrong location. Good catch. Thanks for the report and fix. Chet -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, ITS, CWRUc...@case.eduhttp://cnswww.cns.cwru.edu/~chet/