[BUG] Bash not reacting to Ctrl-C
Date: Fri, 28 Jan 2011 13:36:50 -0500
From: Mathieu Desnoyers
To: Anca Emanuel
Cc: Thomas Gleixner, Ingo Molnar, Tejun Heo, rol...@redhat.com,
    o...@redhat.com, jan.kratoch...@redhat.com,
    linux-ker...@vger.kernel.org, torva...@linux-foundation.org,
    a...@linux-foundation.org, Peter Zijlstra, Frédéric Weisbecker
Subject: Re: [PATCHSET] ptrace,signal: group stop / ptrace updates

* Anca Emanuel (anca.eman...@gmail.com) wrote:
> On Fri, Jan 28, 2011 at 7:41 PM, Thomas Gleixner wrote:
> > On Fri, 28 Jan 2011, Ingo Molnar wrote:
> >> See that '^C^C' line? That is where i had to do Ctrl-C twice.
> >>
> >> It only fails here about once every 10 times, so it's very rare. I have a
> >> stock F14 system running on that box, with the very latest .38 based kernel.
> >
> > Tripped over the refuse ^C thing today twice. Had to kill a kernel
> > build from another shell. It just happily displayed ^C and never
> > stopped. That happens once in a while and I have no idea either how to
> > debug that.
>
> cc: Mathieu
>
> Use lttng ?

Heh :) I'm sure Ingo and Thomas have their own tools for that ;)

There is one extra thing in the LTTng instrumentation that can help
solve this problem: the "input subsystem" instrumentation (enabled with
ltt-armall -i). You can then get a dump of:

- Your keystrokes (you can then grep for your ctrl-c input)
- Read/poll/select system calls (so you know when your terminal receives
  the input).
- Signals sent/delivered

Some of these are already instrumented in the mainline kernel, so you
might get away without the input subsystem instrumentation.

If I had to take a wild guess, my bet would be to take a look in the
area of signal delivery, but you never know, maybe it's a userspace bug
in the X terminal emulator code that is causing this weirdness.

Hope this helps,

Mathieu

--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

Date: Fri, 28 Jan 2011 18:55:33 +0100
From: Oleg Nesterov
To: Ingo Molnar
Cc: Tejun Heo, rol...@redhat.com, jan.kratoch...@redhat.com,
    linux-ker...@vger.kernel.org, torva...@linux-foundation.org,
    a...@linux-foundation.org, Peter Zijlstra, Thomas Gleixner,
    Frédéric Weisbecker
Subject: Re: [PATCHSET] ptrace,signal: group stop / ptrace updates

On 01/28, Ingo Molnar wrote:
>
> The bug is that occasionally Ctrl-C does not get processed, and that the
> Ctrl-C is 'lost'. It can be reproduced here by running ./test-signal
> several times, and Ctrl-C-ing it:
>
> $ ./test-signal
> ^C
> $ ./test-signal
> ^C^C
> $ ./test-signal
> ^C
>
> See that '^C^C' line? That is where i had to do Ctrl-C twice.

Reproduced.

At first glance, /bin/sh should be blamed... Hmm, probably yes, I even
reproduced this under strace, and this is what I see
Re: [BUG] Bash not reacting to Ctrl-C
On 02/08, Chet Ramey wrote:
>
> On 2/8/11 1:21 PM, Oleg Nesterov wrote:
> > Hello,
> >
> > We believe that the non-interactive bash doesn't handle CTRL-C
> > correctly, please look into the attached thread from lkml for
> > more details.
>
> Read http://www.cons.org/cracauer/sigint.html

oooh... it is huge! will try tomorrow.

> and see if you still
> feel the same way.

Which way? ;)

Please note that I wasn't sure when I sent this bug-report. Although as
a bash user I certainly dislike the fact that you can never interrupt a
shell script reliably.

Let's return to the first example,

	$ sh -c 'while true; do /bin/true; done'

Do you think it is OK to miss ^C in this case?

Once again, I won't persist if you think this is fine, and I'll try to
read the docs above tomorrow. But I would appreciate it very much if you
could explain why exactly this is fine.

So far I am looking at the

	WUE shell would not have this problem, since they discontinue the
	script on their own. But as I said, they don't support programs
	using SIGINT for non-exiting purposes

part of the documentation, but can't understand it.

Oleg.
Re: [BUG] Bash not reacting to Ctrl-C
On 02/08, Bob Proulx wrote:
>
> Oleg Nesterov wrote:
> > $ sh -c 'while true; do /bin/true; done'
>
> Be careful that 'sh' is actually 'bash'. It isn't on a lot of
> machines. To ensure that you are actually running bash you should
> call bash explicitly. (At least we can't assume you are running bash
> otherwise.)

It is. In fact I did "./bash" while testing.

> Is the behavior you observe any different for this case?
>
> $ bash -c 'while true; do /bin/true || exit 1; done'
>
> Or different for this case?
>
> $ bash -e -c 'while true; do /bin/true; done'

The same.

I do not know what "-e" does (and I can't find it in man), but how can
this make a difference?

Once again. If bash gets ^C and at the same time the current foreground
child exits normally (either because this jctl signal races with exit()
or because the child hooks SIGINT and exits after that), SIGINT is lost.

set_job_status_and_cleanup() insists that WTERMSIG(child->status) should
be SIGINT, iow the child should be killed by the same signal. Otherwise
bash is not going to kill itself, and the next wait_for() clears
wait_sigint_received.

This all looks intentional, but this means ^C can never work reliably.

Oleg.
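The rule described above can be written down as a small truth table. What follows is only a simplified model of the behaviour as reported in this thread, not bash source code: the function name `wce_decision` and its 0/1 flag are invented for illustration, and it relies on the usual shell convention that a child killed by signal N shows up as exit status 128+N (130 for SIGINT).

```shell
#!/bin/sh
# Simplified model (an assumption, not bash source) of the check in
# set_job_status_and_cleanup() as described in this thread.
#   $1: 1 if the shell itself saw SIGINT while waiting, else 0
#   $2: exit status of the foreground child as seen in $?
wce_decision() {
    sigint_received=$1
    child_status=$2
    # 130 = 128 + 2: the child was killed by SIGINT (signal 2)
    if [ "$sigint_received" -eq 1 ] && [ "$child_status" -eq 130 ]; then
        echo abort      # shell acts on the SIGINT and terminates
    else
        echo continue   # SIGINT is dropped, the script goes on
    fi
}

wce_decision 1 130   # ^C killed the child
wce_decision 1 0     # ^C raced with a normal exit: the "lost" ^C
wce_decision 0 130   # only the child was signalled
```

The second call is the case under discussion: the shell did receive SIGINT, but because the child's status says "exited normally", the interrupt is discarded.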
Re: [BUG] Bash not reacting to Ctrl-C
On 02/08, Chet Ramey wrote:
>
> On 2/8/11 4:17 PM, Oleg Nesterov wrote:
>
> > Once again. If bash gets ^C and at the same time the current foreground
> > child exits normally (either because this jctl signal races with exit()
> > or because the child hooks SIGINT and exits after that) SIGINT is lost.
> >
> > set_job_status_and_cleanup() insists that WTERMSIG(child->status) should
> > be SIGINT, iow the child should be killed by the same signal. Otherwise
> > it is not going to kill itself, and the next wait_for() clears
> > wait_sigint_received.
> >
> > This all looks intentional, but this means ^C can never work reliably.
>
> It depends on what you mean by `reliably'.

Sure, I understand that it is not that simple.

> Consider a script that runs
> emacs, then does other processing when emacs completes. Emacs uses SIGINT
> internally to interrupt editing commands, but handles it and does not exit
> as a result. Since emacs is run from a script, and job control is not
> enabled, the shell receives the SIGINT also, because it is in the
> terminal's foreground process group. Should the shell abort the script
> when emacs exits?

In my opinion - it should.

But yes, I know almost nothing about jctl (at least the non-kernel
part), and I agree this behaviour can confuse a user too. That is why I
provided another test-case, let me repeat it:

	#!./bash
	perl -we '$SIG{INT} = sub {exit}; sleep'
	echo "Hehe, I am going to sleep after ^C"
	sleep 100

If a user presses ^C the shell can't know what he wants: to kill the
script or to send the signal to the current job.

However, I think the shell should react and exit. Exactly because it
runs in the same foreground process group.

If the user doesn't want this behaviour he can change the script, say,

	#!./bash
	trap true SIGINT
	perl -we '$SIG{INT} = sub {exit}; sleep'
	trap - SIGINT
	echo "OK, WCE mode makes sense sometime"
	sleep 100

Better yet, perhaps bash can have a new command/builtin which does
setpgid() and TIOCSPGRP before running the command.

Oleg.
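The "trap around one command" idea from the script above can be wrapped in a small helper. This is a sketch only: `run_shielded` is a hypothetical name, not a bash builtin, and it assumes the script has no other SIGINT trap that would need to be saved and restored.

```shell
#!/bin/sh
# run_shielded is a hypothetical helper name, not a bash builtin.
# It runs one command while the shell itself catches SIGINT and does
# nothing; the child still starts with the default SIGINT disposition,
# because caught signals are reset to default in child processes.
run_shielded() {
    trap : INT
    "$@"
    rc=$?
    trap - INT      # back to default SIGINT handling for the rest of the script
    return "$rc"
}

run_shielded sh -c 'exit 3' || echo "command exited with status $?"
echo "back to normal SIGINT handling"
```

With this shape the author of the script states explicitly which commands use SIGINT for their own purposes, instead of the shell having to guess from the child's wait status.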
Re: [BUG] Bash not reacting to Ctrl-C
On 02/08, Chet Ramey wrote:
>
> On 2/8/11 7:11 PM, Ingo Molnar wrote:
> >
> > Oleg also found another simple testcase i think - and Thomas (Cc:-ed)
> > reported similar Ctrl-C problems with Bash as well.
>
> I tried to reproduce it and wasn't able to. I use Mac OS X.

Strange, but I know nothing about Mac OS...

Hmm. Do you mean the "perl -e" test-case doesn't work either? Did you
try the other test-cases from

	http://marc.info/?l=linux-kernel&m=129623373208782

(this message was attached)?

OK, another test-case,

	#!./bash
	perl -we 'kill INT, getppid'
	echo "Hehe, I am going to sleep after ^C"
	sleep 100

("perl -e" just sends SIGINT to the parent)

To clarify, I do not claim this particular case "proves" the shell is
buggy. Just to illustrate the problem: the shell refuses to exit unless
the child was killed by SIGINT too.

Oleg.
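The same situation can be made visible without perl. In the sketch below (an illustration, not one of the thread's test cases) the child signals its parent and then exits normally, and the parent uses a trap so that the SIGINT is recorded instead of being silently handled. It assumes SIGINT is not already ignored when the script starts.

```shell
#!/bin/sh
# The middle shell records SIGINT via a trap; its child sends SIGINT to
# the parent (by pid, not to the process group) and then exits with 0.
out=$(sh -c '
    trap "echo parent-got-SIGINT" INT
    sh -c "kill -INT \$PPID"     # child: signal the parent, then exit 0
    echo "child-status=$?"
')
printf '%s\n' "$out"
```

Both facts are then on record at once: the parent shell did receive SIGINT, and the child's status says it exited normally. That is exactly the combination in which, without the trap, the shell in this thread carries on with the script.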
Re: [BUG] Bash not reacting to Ctrl-C
On 02/09, Bob Proulx wrote:
>
> Oleg Nesterov wrote:
> > Bob Proulx wrote:
> > > Is the behavior you observe any different for this case?
> > > $ bash -c 'while true; do /bin/true || exit 1; done'
> > > Or different for this case?
> > > $ bash -e -c 'while true; do /bin/true; done'
> >
> > The same.
>
> I expected that to behave differently for you because I expected that
> the issue was that /bin/true was being delivered the signal but the
> exit status of /bin/true is being ignored in your test case. In your
> test case if /bin/true caught the SIGINT then I expect the loop to
> continue. Since you were saying that it was continuing then that is
> what I was expecting was happening.

Well, it is too late for me ;) perhaps I misunderstood your point. But I
think this doesn't matter, see below.

> > I do not know what "-e" does (and I can't find it in man), but how
> > this can make a difference?
>
> The documentation says this about -e:
>
> [... snip ...]

Aha, thanks a lot.

> Using -e would cause the shell to exit if /bin/true returned a
> non-zero exit status. /bin/true would exit non-zero if it caught a
> SIGINT signal.

If /bin/true gets SIGINT - everything is fine.

With this particular test-case the problem is: ^C races with
true/false/whatever doing exit(any_exit_code). In this case the shell
"ignores" the signal.

Oleg.
Re: [BUG] Bash not reacting to Ctrl-C
On 02/09, Bob Proulx wrote:
>
> Ingo Molnar wrote:
> > Could you try the reproducer please?
> >
> > Once you run it, try to stop it via Ctrl-C, and try to do this a
> > couple of times.
>
> I was not able to reproduce your problem using your (I believe to be
> slightly incorrect) test case:
>
> bash -c 'while true; do /bin/true; done'
>
> It was always interrupted with a single control-C on my amd64 Debian
> Squeeze machine. I expect this means that by chance it was always
> bash running in the foreground process and /bin/true never happened to
> be there at the right time.
>
> > Do you consider it normal that it often takes 2-3 Ctrl-C attempts to
> > interrupt that script, that it is not possible to stop the script
> > reliably with a single Ctrl-C?
>
> Since the exit status of /bin/true is ignored then I think that test
> case is flawed. I think at the least needs to check the exit status
> of the /bin/true process.
>
> bash -c 'while true; do /bin/true || exit 1; done'

Perhaps I misread jobs.c (this is very possible). But afaics bash always
checks "status" after waitpid(&status), and the exit code does not
matter at all. What does matter is whether WIFSIGNALED() and
WTERMSIG() == SIGINT or not.

Oleg.
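The difference between "killed by SIGINT" and "caught SIGINT, then exited" is visible from a script through `$?`. The sketch below is an illustration under the usual 128+N convention; `classify_status` is a made-up helper, and `$?` alone cannot tell a real signal death from a program that simply calls exit with a value above 128, which the C-level WIFSIGNALED()/WTERMSIG() macros can.

```shell
#!/bin/sh
# classify_status is a hypothetical helper for this illustration.
classify_status() {
    if [ "$1" -eq 130 ]; then
        echo "killed by SIGINT"
    elif [ "$1" -gt 128 ]; then
        echo "killed by signal $(( $1 - 128 ))"
    else
        echo "exited normally with code $1"
    fi
}

# Child with the default disposition: SIGINT kills it.
st_killed=0
sh -c 'kill -INT $$' || st_killed=$?

# Child that catches SIGINT and then exits on its own.
st_handled=0
sh -c 'trap "exit 0" INT; kill -INT $$' || st_handled=$?

echo "default handler: $(classify_status "$st_killed")"
echo "own handler:     $(classify_status "$st_handled")"
```

From the waiting shell's point of view the second child is indistinguishable from /bin/true finishing just as ^C arrives, which is why adding `|| exit 1` to the loop does not change the outcome.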
Re: [BUG] Bash not reacting to Ctrl-C
On 02/09, Bob Proulx wrote:
>
> Oleg Nesterov wrote:
>
> > That is why I provided another test-case, let me repeat it:
>
> Sorry but I missed seeing that the first time through or I would have
> commented.
>
> > #!./bash
> > perl -we '$SIG{INT} = sub {exit}; sleep'
> > echo "Hehe, I am going to sleep after ^C"
> > sleep 100
>
> This test case is flawed in that as written perl will eat the signal
> and ignore it. It isn't fair to explicitly ignore the signal.

Sure! But you misunderstood. This test-case does not try to prove that
bash is buggy. Quite the contrary, I created it exactly because I
started to suspect that the current behaviour is probably intentional,
at least partly.

And it illustrates how and why the test-case with /bin/true can miss a
signal. Because, from /bin/sh's pov, "eat the signal and exit" does not
differ from another case: ^C races with do_exit().

Oleg.
Re: [BUG] Bash not reacting to Ctrl-C
On 02/11, Illia Bobyr wrote:
>
> Do we really need to check wait_sigint_received here?
> If the child exits because of SIGINT was indeed received all the
> processes on the same terminal will also receive it.

Only if SIGINT was sent to the pgrp (like ^C sends SIGINT to the
foreground process group).

> --- bash-4.1/jobs.c~ctrlc_exit_race 2011-02-07 13:52:48.0 +0100
> +++ bash-4.1/jobs.c 2011-02-07 13:55:30.0 +0100
> @@ -3299,7 +3299,7 @@ set_job_status_and_cleanup (job)
>    signals are sent to process groups) or via kill(2) to the foreground
>    process by another process (or itself). If the shell did receive the
>    SIGINT, it needs to perform normal SIGINT processing. */
> - else if (wait_sigint_received && (WTERMSIG (child->status) == SIGINT) &&
> + else if ((WTERMSIG (child->status) == SIGINT) &&

The problem is, if WTERMSIG() == SIGINT everything is fine. Quite the
contrary, we need to handle the case when the last running command was
_not_ killed but exited on its own.

Oleg.
Re: [BUG] Bash not reacting to Ctrl-C
On 02/11, Chet Ramey wrote:
>
> You do realize that this case is indistinguishable from the original
> scenario in question: the child gets the SIGINT, handles it, and exits
> successfully (or not).

I already tried to discuss this, but you didn't reply ;) See

	http://www.mail-archive.com/bug-bash@gnu.org/msg08528.html

So, if I understand correctly, you mean that

	#!/bin/sh
	interactive_application
	echo DONE

shouldn't be interrupted by SIGINT after interactive_application exits.
For example, it can be a text-editor which treats SIGINT specially.

But, in this case, shouldn't we fix the script above? In this case the
shell and the application should not run in the same tty->pgrp group, or
we can add "trap SIGINT".

Oleg.
Re: [BUG] Bash not reacting to Ctrl-C
On 02/11, Chet Ramey wrote:
>
> In the meantime, read Martin Cracauer's description of the issue.
> http://www.cons.org/cracauer/sigint.html.

I did. OK, OK, I didn't ;) I stopped reading as soon as I started to
think I understood why you sent me this link.

Oleg.
Re: [BUG] Bash not reacting to Ctrl-C
On 02/11, Linus Torvalds wrote:
>
> @@ -2424,6 +2425,18 @@ wait_for (pid)
>    sigaction (SIGCHLD, &oact, (struct sigaction *)NULL);
>    sigprocmask (SIG_SETMASK, &chldset, (sigset_t *)NULL);
>  # endif
> +  /* If the waitchld returned EINTR, and the shell got a SIGINT,
> +     then the child has not died yet, and we assume that the
> +     child has blocked SIGINT. In that case, we require that the
> +     child return with WTERMSIG() == SIGINT to actually consider
> +     the ^C relevant. This is racy (the child may be in the
> +     process of exiting and bash reacted to the EINTR first),
> +     but this makes the race window much much smaller */

OK, I leave this up to you and Chet. At least the race is documented.

Another problem: child_blocked_sigint can be a false positive if the
signal was sent to bash directly (not to the pgrp). This means that the
next ^C won't work again.

And,

> +  if (r == -1 && errno == EINTR && wait_sigint_received)
> +    {
> +      child_blocked_sigint = 1;
> +    }

This can't work afaics. waitchld() can never return -1 && EINTR. Perhaps
waitchld() can set this flag, I don't know...

	/* If waitpid returns 0, there are running children. If it returns -1,
	   the only other error POSIX says it can return is EINTR. */
	CHECK_TERMSIG;
	if (pid <= 0)
	  continue; /* jumps right to the test */

The code looks strange btw. "jumps right to the test" is correct, but
this code does

	do { ... } while ((sigchld || block == 0) && pid > (pid_t)0);

and this "continue" in fact means "break". So, perhaps, we can do

	if (pid < 0) {
		if (wait_sigint_received)
			child_blocked_sigint = 1;
		break;
	}

Oleg.
Re: [BUG] Bash not reacting to Ctrl-C
On 03/07, Chet Ramey wrote:
>
> > So I don't think my patch is really doing what it _intends_ to do.
>
> Let's take a step back and approach this a different way. Instead of
> trying to intuit whether or not the child did anything with the SIGINT,
> let's try to make the race condition smaller.

OK, I'll try to test this patch later to see if it makes a difference...
At least a subjective difference.

But,

> The following patch is a very small change to jobs.c that makes
> wait_sigint_handler only pay attention and set wait_sigint_received when
> the shell is actually in waitpid() waiting for the child. It uses a
> semaphore around the call to waitpid to effect that, with a little
> bookkeeping and cleanup code. When the shell gets a SIGINT while not
> actually waiting for a child, it restores the old handler and sends
> SIGINT to itself.

Hmm. It is very possible I do not understand the patch correctly. But
doesn't this patch introduce another problem?

> *** 3090,3096
> --- 3099,3107
>       waitpid_flags |= WNOHANG;
>     }
>
> +   waiting_for_child++;
>     pid = WAITPID (-1, &status, waitpid_flags);

OK, and what if ^C comes before waiting_for_child++ ?

IIUC, in this case bash exits and leaves the current application (say,
emacs which treats SIGINT specially) alone, no?

Oleg.
Re: [BUG] Bash not reacting to Ctrl-C
On 03/07, Chet Ramey wrote:
>
> > On 03/07, Chet Ramey wrote:
> > >
> > > *** 3090,3096
> > > --- 3099,3107
> > >       waitpid_flags |= WNOHANG;
> > >     }
> > >
> > > +   waiting_for_child++;
> > >     pid = WAITPID (-1, &status, waitpid_flags);
> >
> > OK, and what if ^C comes before waiting_for_child++ ?
> >
> > IIUC, in this case bash exits and leaves the current application
> > (say, emacs which treats SIGINT specially) alone, no?
>
> Yes, it does. However, the same problem exists now. There is a window
> between the time bash forks, the child execs, and bash waits when a SIGINT
> can arrive and the same thing will happen.

OK... I seem to understand: make_child() blocks SIGINT, but at this
point the signal handler is SIG_DFL. And then it forks and unblocks the
signal without installing the handler.

Thanks. I am just curious, is this another bug/problem or was this
intended?

Oleg.