Re: [BUG] Bash not reacting to Ctrl-C

2011-02-09 Thread Oleg Nesterov
On 02/08, Chet Ramey wrote:
>
> On 2/8/11 4:17 PM, Oleg Nesterov wrote:
>
> > Once again. If bash gets ^C and at the same time the current foreground
> > child exits normally (either because this jctl signal races with exit()
> > or because the child hooks SIGINT and exits after that) SIGINT is lost.
> >
> > set_job_status_and_cleanup() insists that WTERMSIG(child->status) should
> > be SIGINT, iow the child should be killed by the same signal. Otherwise
> > it is not going to kill itself, and the next wait_for() clears
> > wait_sigint_received.
> >
> > This all looks intentional, but this means ^C can never work reliably.
>
> It depends on what you mean by `reliably'.

Sure, I understand that it is not that simple.

> Consider a script that runs
> emacs, then does other processing when emacs completes.  Emacs uses SIGINT
> internally to interrupt editing commands, but handles it and does not exit
> as a result.  Since emacs is run from a script, and job control is not
> enabled, the shell receives the SIGINT also, because it is in the
> terminal's foreground process group.  Should the shell abort the script
> when emacs exits?

In my opinion - it should. But yes, I know almost nothing about jctl
(at least the non-kernel part), and I agree this behaviour can confuse
a user too.

That is why I provided another test-case, let me repeat it:

#!./bash

perl -we '$SIG{INT} = sub {exit}; sleep'

echo "Hehe, I am going to sleep after ^C"
sleep 100

If a user presses ^C, the shell can't know what they want: to kill the
script, or to send the signal to the current job.

However. I think the shell should react and exit. Exactly because it
runs in the same foreground process group. If the user doesn't want
this behaviour he can change the script, say,

#!./bash

trap true SIGINT
perl -we '$SIG{INT} = sub {exit}; sleep'
trap - SIGINT

echo "OK, WCE mode makes sense sometime"
sleep 100

Better yet, perhaps bash can have the new command/builtin which does
setpgid() and TIOCSPGRP before running the command.
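
A rough C sketch of what such a builtin might do, assuming the usual
fork/setpgid()/tcsetpgrp() sequence (tcsetpgrp() is the POSIX wrapper
around the TIOCSPGRP ioctl). run_in_own_pgrp() and give_terminal_to()
are hypothetical names, not bash functions, and error handling is kept
minimal:

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: make pgrp the terminal's foreground group.
 * SIGTTOU is ignored around the call because a background process
 * calling tcsetpgrp() would otherwise be stopped. Failures (for
 * example, tty is not the controlling terminal) are ignored here. */
static void give_terminal_to(int tty, pid_t pgrp)
{
    void (*old)(int) = signal(SIGTTOU, SIG_IGN);
    tcsetpgrp(tty, pgrp);
    signal(SIGTTOU, old);
}

/* Hypothetical helper: run argv in its own foreground process group,
 * so that ^C reaches only the command and not the calling shell.
 * Returns the raw wait status, or -1 on error. */
int run_in_own_pgrp(char *const argv[])
{
    int tty = isatty(STDIN_FILENO) ? STDIN_FILENO : -1;
    pid_t shell_pgrp = getpgrp();
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        setpgid(0, 0);                    /* new group, led by the child */
        if (tty >= 0)
            give_terminal_to(tty, getpgrp());
        signal(SIGINT, SIG_DFL);
        execvp(argv[0], argv);
        _exit(127);                       /* exec failed */
    }
    setpgid(pid, pid);  /* repeated in the parent to avoid a race; may fail after exec */
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    if (tty >= 0)
        give_terminal_to(tty, shell_pgrp); /* take the terminal back */
    return status;
}
```

With this arrangement the script itself never sees the SIGINT, so the
"kill the script or signal the job" ambiguity would not arise for
commands started this way.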

Oleg.




Re: [BUG] Bash not reacting to Ctrl-C

2011-02-09 Thread Oleg Nesterov
On 02/08, Chet Ramey wrote:
>
> On 2/8/11 7:11 PM, Ingo Molnar wrote:
> >
> > Oleg also found another simple testcase i think - and Thomas (Cc:-ed)
> > reported similar Ctrl-C problems with Bash as well.
>
> I tried to reproduce it and wasn't able to.  I use Mac OS X.

Strange, but I know nothing about Mac OS...

Hmm. Do you mean the "perl -e" test-case doesn't work either? Did you
try the other test-cases from http://marc.info/?l=linux-kernel&m=129623373208782
(this message was attached)?

OK, another test-case,

#!./bash

perl -we 'kill INT, getppid'

echo "Hehe, I am going to sleep after ^C"
sleep 100

("perl -e" just sends SIGINT to the parent)

To clarify, I do not claim this particular case "proves" the shell
is buggy. Just to illustrate the problem: the shell refuses to exit
unless the child was killed by SIGINT too.

Oleg.




Re: [BUG] Bash not reacting to Ctrl-C

2011-02-09 Thread Ingo Molnar

* Oleg Nesterov wrote:

> That is why I provided another test-case, let me repeat it:
> 
>   #!./bash
> 
>   perl -we '$SIG{INT} = sub {exit}; sleep'
> 
>   echo "Hehe, I am going to sleep after ^C"
>   sleep 100

This reliably reproduces the (formerly sporadic) script Ctrl-C bug here,
100% of the time:

 aldebaran:~> ./test-signal-perl.sh 
 ^CHehe, I am going to sleep after ^C

 [ it waits 100 seconds ]

Thanks,

Ingo



miscompilation at gcc -O2

2011-02-09 Thread Eric Blake
Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS:  -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64'
-DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-redhat-linux-gnu'
-DCONF_VENDOR='redhat' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash'
-DSHELL -DHAVE_CONFIG_H   -I.  -I. -I./include -I./lib  -D_GNU_SOURCE
-DRECYCLES_PIDS  -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions
-fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic
uname output: Linux office 2.6.35.10-74.fc14.x86_64 #1 SMP Thu Dec 23
16:04:50 UTC 2010 x86_64 x86_64 x86_64 GNU/Linux
Machine Type: x86_64-redhat-linux-gnu


Bash Version: 4.1
Patch Level: 7
Release Status: release

Description:
There is a report of bash being miscompiled for cygwin when using gcc
4.3.4 -O2, but succeeding when compiled with -O1:
http://cygwin.com/ml/cygwin/2011-02/msg00230.html

Compiling with -Wextra reveals the culprit:
execute_cmd.c: In function ‘execute_function.clone.2’:
execute_cmd.c:4007:23: warning: variable ‘bash_source_a’ might be
clobbered by ‘longjmp’ or ‘vfork’
execute_cmd.c:4007:39: warning: variable ‘bash_lineno_a’ might be
clobbered by ‘longjmp’ or ‘vfork’
execute_cmd.c: In function ‘execute_in_subshell’:
execute_cmd.c:1296:12: warning: variable ‘tcom’ might be clobbered by
‘longjmp’ or ‘vfork’

POSIX is clear that the value of an automatic variable changed between
setjmp() and the subsequent longjmp() is unspecified unless the variable
is marked volatile, but bash is violating this constraint and modifying
several variables that cannot reliably be restored.  Depending on what
code transformations the compiler makes, this can lead to crashes; in
cygwin's case, it appears that mere execution of a trap return handler
can cause bash to corrupt its own stack.
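
A minimal C sketch of the rule, unrelated to bash's own code: a local
that is modified after setjmp() and read after longjmp() must be
volatile to have a defined value. attempts_until_done() and
retry_point are invented names for illustration.

```c
#include <setjmp.h>

static jmp_buf retry_point;

/* Hypothetical example: jump back to retry_point until `limit`
 * attempts have been made, then return the attempt count. */
int attempts_until_done(int limit)
{
    /* Modified between setjmp() and longjmp(), then read afterwards:
     * without `volatile` its value after the jump is indeterminate,
     * because the compiler may keep it in a register that longjmp()
     * restores to its setjmp()-time contents. */
    volatile int attempts = 0;

    setjmp(retry_point);        /* execution resumes here after longjmp() */
    attempts++;
    if (attempts < limit)
        longjmp(retry_point, 1);
    return attempts;
}
```

Dropping the volatile qualifier here is exactly the pattern that
gcc's "might be clobbered" warning is reporting.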

Repeat-By:
make
rm execute_cmd.o
make CFLAGS='-Wextra -O2'

Fix:
--- execute_cmd.c.orig  2011-02-09 11:53:13.470850670 -0700
+++ execute_cmd.c   2011-02-09 11:53:48.422939088 -0700
@@ -1293,7 +1293,7 @@
   int user_subshell, return_code, function_value, should_redir_stdin, invert;
   int ois, user_coproc;
   int result;
-  COMMAND *tcom;
+  COMMAND *volatile tcom;

   USE_VAR(user_subshell);
   USE_VAR(user_coproc);
@@ -4004,7 +4004,7 @@
   char *debug_trap, *error_trap, *return_trap;
 #if defined (ARRAY_VARS)
   SHELL_VAR *funcname_v, *nfv, *bash_source_v, *bash_lineno_v;
-  ARRAY *funcname_a, *bash_source_a, *bash_lineno_a;
+  ARRAY *funcname_a, *volatile bash_source_a, *volatile bash_lineno_a;
 #endif
   FUNCTION_DEF *shell_fn;
   char *sfile, *t;


-- 
Eric Blake   ebl...@redhat.com    +1-801-349-2682
Libvirt virtualization library http://libvirt.org





Re: [BUG] Bash not reacting to Ctrl-C

2011-02-09 Thread Bob Proulx
Oleg Nesterov wrote:
> Bob Proulx wrote:
> > Is the behavior you observe any different for this case?
> >   $ bash -c 'while true; do /bin/true || exit 1; done'
> > Or different for this case?
> >   $ bash -e -c 'while true; do /bin/true; done'
> 
> The same.

I expected that to behave differently for you because I expected that
the issue was that /bin/true was being delivered the signal but the
exit status of /bin/true is being ignored in your test case.  In your
test case if /bin/true caught the SIGINT then I expect the loop to
continue.  Since you were saying that it was continuing then that is
what I was expecting was happening.

> I do not know what "-e" does (and I can't find it in man), but how
> this can make a difference?

The documentation says this about -e:

  -e  Exit immediately if a pipeline (which may consist
  of a single simple command), a subshell command
  enclosed in parentheses, or one of the commands
  executed as part of a command list enclosed by
  braces (see SHELL GRAMMAR above) exits with a
  non-zero status.  The shell does not exit if the
  command that fails is part of the command list
  immediately following a while or until keyword,
  part of the test following the if or elif
  reserved words, part of any command executed in
  a && or || list except the command following the
  final && or ||, any command in a pipeline but the
  last, or if the command's return value is being
  inverted with !.  A trap on ERR, if set, is
  executed before the shell exits.  This option
  applies to the shell environment and each
  subshell environment separately (see COMMAND
  EXECUTION ENVIRONMENT above), and may cause
  subshells to exit before executing all the
  commands in the subshell.

Using -e would cause the shell to exit if /bin/true returned a
non-zero exit status.  /bin/true would exit non-zero if it caught a
SIGINT signal.

Bob



Re: [BUG] Bash not reacting to Ctrl-C

2011-02-09 Thread Bob Proulx
Ingo Molnar wrote:
> Could you try the reproducer please?
> 
> Once you run it, try to stop it via Ctrl-C, and try to do this a
> couple of times.

I was not able to reproduce your problem using your (I believe to be
slightly incorrect) test case:

  bash -c 'while true; do /bin/true; done'

It was always interrupted with a single control-C on my amd64 Debian
Squeeze machine.  I expect this means that by chance it was always
bash running in the foreground process and /bin/true never happened to
be there at the right time.

> Do you consider it normal that it often takes 2-3 Ctrl-C attempts to
> interrupt that script, that it is not possible to stop the script
> reliably with a single Ctrl-C?

Since the exit status of /bin/true is ignored, I think that test
case is flawed.  I think it at least needs to check the exit status
of the /bin/true process.

  bash -c 'while true; do /bin/true || exit 1; done'

Bob



Re: [BUG] Bash not reacting to Ctrl-C

2011-02-09 Thread Bob Proulx
Oleg Nesterov wrote:
> That is why I provided another test-case, let me repeat it:

Sorry but I missed seeing that the first time through or I would have
commented.

>   #!./bash
>   perl -we '$SIG{INT} = sub {exit}; sleep'
>   echo "Hehe, I am going to sleep after ^C"
>   sleep 100

This test case is flawed in that as written perl will eat the signal
and ignore it.  It isn't fair to explicitly ignore the signal.

Instead try this improved test case with corrected signal handling.

  #!/bin/bash
  perl -we '$SIG{INT}=sub{$SIG{INT}="DEFAULT";kill(INT,$$);}; sleep' || exit 1
  echo "Hehe, I am going to sleep after ^C"
  sleep 100
  exit 0

Does this get interrupted after one SIGINT now that it isn't being
caught and ignored?

To be clear I am simply trying to make sure the test cases are not
themselves creating the problem.
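
For comparison, the same "restore the default action and re-send the
signal to yourself" idiom as a C sketch. The function names are
invented for illustration. The parent ends up with a wait status
that says the child was killed by SIGINT, which is what the shell
looks for:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Handler that does its cleanup (none here), then dies of SIGINT. */
static void die_by_sigint(int sig)
{
    signal(sig, SIG_DFL);   /* restore the default action */
    raise(sig);             /* delivered once the handler returns */
}

/* Hypothetical demo: wait status of a child that catches SIGINT and
 * re-raises it, or -1 on error. */
int status_of_child_that_reraises_sigint(void)
{
    int ready[2], status;
    char byte = 0;
    pid_t pid;

    if (pipe(ready) < 0)
        return -1;
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        struct sigaction sa;
        sigset_t set;

        sa.sa_handler = die_by_sigint;
        sigemptyset(&sa.sa_mask);
        sa.sa_flags = 0;
        sigaction(SIGINT, &sa, NULL);
        sigemptyset(&set);
        sigaddset(&set, SIGINT);
        sigprocmask(SIG_UNBLOCK, &set, NULL); /* in case it was inherited blocked */
        close(ready[0]);
        if (write(ready[1], &byte, 1) != 1)   /* handler is installed */
            _exit(1);
        for (;;)
            pause();
    }
    close(ready[1]);
    if (read(ready[0], &byte, 1) != 1)
        kill(pid, SIGKILL);                   /* child never got ready */
    else
        kill(pid, SIGINT);                    /* stands in for ^C */
    close(ready[0]);
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return status;
}
```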

Bob



Re: [BUG] Bash not reacting to Ctrl-C

2011-02-09 Thread Oleg Nesterov
On 02/09, Bob Proulx wrote:
>
> Oleg Nesterov wrote:
> > Bob Proulx wrote:
> > > Is the behavior you observe any different for this case?
> > >   $ bash -c 'while true; do /bin/true || exit 1; done'
> > > Or different for this case?
> > >   $ bash -e -c 'while true; do /bin/true; done'
> >
> > The same.
>
> I expected that to behave differently for you because I expected that
> the issue was that /bin/true was being delivered the signal but the
> exit status of /bin/true is being ignored in your test case.  In your
> test case if /bin/true caught the SIGINT then I expect the loop to
> continue.  Since you were saying that it was continuing then that is
> what I was expecting was happening.

Well, it is too late for me ;) perhaps I misunderstood your point.
But I think this doesn't matter, see below.

> > I do not know what "-e" does (and I can't find it in man), but how
> > this can make a difference?
>
> The documentation says this about -e:
>
> [... snip ...]

Aha, thanks a lot.

> Using -e would cause the shell to exit if /bin/true returned a
> non-zero exit status.  /bin/true would exit non-zero if it caught a
> SIGINT signal.

If /bin/true gets SIGINT - everything is fine. With this particular
test-case the problem is that ^C races with true/false/whatever
doing exit(any_exit_code). In this case the shell "ignores" the
signal.

Oleg.




Re: [BUG] Bash not reacting to Ctrl-C

2011-02-09 Thread Oleg Nesterov
On 02/09, Bob Proulx wrote:
>
> Ingo Molnar wrote:
> > Could you try the reproducer please?
> >
> > Once you run it, try to stop it via Ctrl-C, and try to do this a
> > couple of times.
>
> I was not able to reproduce your problem using your (I believe to be
> slightly incorrect) test case:
>
>   bash -c 'while true; do /bin/true; done'
>
> It was always interrupted with a single control-C on my amd64 Debian
> Squeeze machine.  I expect this means that by chance it was always
> bash running in the foreground process and /bin/true never happened to
> be there at the right time.
>
> > Do you consider it normal that it often takes 2-3 Ctrl-C attempts to
> > interrupt that script, that it is not possible to stop the script
> > reliably with a single Ctrl-C?
>
> Since the exit status of /bin/true is ignored, I think that test
> case is flawed.  I think it at least needs to check the exit status
> of the /bin/true process.
>
>   bash -c 'while true; do /bin/true || exit 1; done'

Perhaps I misread job.c (this is very possible). But afaics bash
always checks "status" after waitpid(&status), and the exit code
does not matter at all. What does matter is whether WIFSIGNALED()
and WTERMSIG() == SIGINT or not.
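
To make that check concrete, here is a minimal C sketch. It is not
bash's code: died_of_sigint() and wait_status_of_child() are invented
for illustration, and only waitpid(), WIFSIGNALED() and WTERMSIG()
are the real interfaces.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical predicate: true only if the child was killed by SIGINT.
 * A child that exits normally fails this test whatever its exit code. */
int died_of_sigint(int status)
{
    return WIFSIGNALED(status) && WTERMSIG(status) == SIGINT;
}

/* Hypothetical demo: fork a child that either kills itself with
 * SIGINT or exits normally with exit_code. Returns the raw wait
 * status, or -1 if fork()/waitpid() failed. */
int wait_status_of_child(int kill_self, int exit_code)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (kill_self) {
            signal(SIGINT, SIG_DFL);  /* in case SIGINT was inherited as ignored */
            raise(SIGINT);
        }
        _exit(exit_code);
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return status;
}
```

Note that a child exiting with code 130 still counts as a normal
exit here; only an actual death by signal satisfies the predicate.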

Oleg.




Re: [BUG] Bash not reacting to Ctrl-C

2011-02-09 Thread Oleg Nesterov
On 02/09, Bob Proulx wrote:
>
> Oleg Nesterov wrote:
>
> > That is why I provided another test-case, let me repeat it:
>
> Sorry but I missed seeing that the first time through or I would have
> commented.
>
> > #!./bash
> > perl -we '$SIG{INT} = sub {exit}; sleep'
> > echo "Hehe, I am going to sleep after ^C"
> > sleep 100
>
> This test case is flawed in that as written perl will eat the signal
> and ignore it.  It isn't fair to explicitly ignore the signal.

Sure! But you misunderstood. This test-case does not try to prove that
bash is buggy. Quite the contrary, I created it exactly because I started
to suspect that the current behaviour is probably intentional, at least
partly.

And, it illustrates how and why the test-case with /bin/true can miss
a signal. Because, from /bin/sh pov "eat the signal and exit" does not
differ from another case: ^C races with do_exit().
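
A C sketch of the "eat the signal and exit" case, with invented
function names. The child catches SIGINT and leaves through a normal
exit, so all the parent can see from waitpid() is WIFEXITED(), the
same thing it would see if the child had simply finished on its own
just as ^C arrived:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Handler mirroring perl's `$SIG{INT} = sub {exit}`. */
static void exit_on_sigint(int sig)
{
    (void)sig;
    _exit(0);   /* async-signal-safe, unlike exit() */
}

/* Hypothetical demo: wait status of a child that catches SIGINT and
 * then exits normally, or -1 on error. */
int status_of_child_that_eats_sigint(void)
{
    int ready[2], status;
    char byte = 0;
    pid_t pid;

    if (pipe(ready) < 0)
        return -1;
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        struct sigaction sa;
        sigset_t set;

        sa.sa_handler = exit_on_sigint;
        sigemptyset(&sa.sa_mask);
        sa.sa_flags = 0;
        sigaction(SIGINT, &sa, NULL);
        sigemptyset(&set);
        sigaddset(&set, SIGINT);
        sigprocmask(SIG_UNBLOCK, &set, NULL); /* in case it was inherited blocked */
        close(ready[0]);
        if (write(ready[1], &byte, 1) != 1)   /* handler is installed */
            _exit(1);
        for (;;)
            pause();
    }
    close(ready[1]);
    if (read(ready[0], &byte, 1) != 1)
        kill(pid, SIGKILL);                   /* child never got ready */
    else
        kill(pid, SIGINT);                    /* stands in for ^C */
    close(ready[0]);
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return status;
}
```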

Oleg.




Re: miscompilation at gcc -O2

2011-02-09 Thread Jon Seymour
Good catch - how long did that take to find?

jon.

On Thu, Feb 10, 2011 at 6:06 AM, Eric Blake wrote:
> [... snip ...]