Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64' -DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-unknown-linux-gnu' -DCONF_VENDOR='unknown' -DLOCALEDIR='/home/trnka/opt/share/locale' -DPACKAGE='bash' -DSHELL -DHAVE_CONFIG_H -I. -I. -I./include -I./lib -g -O2
uname output: Linux a324-2 2.6.24.2 #1 SMP Wed Feb 20 12:36:17 CET 2008 x86_64 GNU/Linux
Machine Type: x86_64-unknown-linux-gnu

Bash Version: 4.1
Patch Level: 2
Release Status: release

Description:
        I've started using coprocesses heavily and I've found a nasty
        problem related (but not limited) to them: after the coprocess
        finishes its job, the resulting SIGCHLD is not properly blocked
        by bash's signal processing logic and interferes with script I/O.

        In my case, I've been using something like:

            read var1 var2 < <( a | long | pipeline | here)
            echo "var1=$var1"
            echo "var2=$var2"

        Sometimes the SIGCHLD arrived just as one of the echos was
        writing its output, and the result was:

            echo: write error: Interrupted system call

        As this is a bit of a race, it occurs only when the stars are
        right, i.e. during normal usage the probability of the SIGCHLD
        hitting exactly the echo is quite low. However, as soon as
        anything causes the I/O to take significantly longer, the bug
        appears. I've been hitting it quite often (30%?) when running
        the script over SSH.

        This bug was probably reported years ago here:
        http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=382798

Repeat-By:
        I've reduced one of my scripts to this (nothing exceptionally
        intelligent, but it does the job):

            #!/bin/bash

            while [[ 1 ]]; do
                set +e
                read tmp tmp2 < <( echo "blabla" | wc | tr -s " " "\n" | tail -n 2 | tr "\n" " ")
                set -e
                echo $tmp
                echo $tmp2
                echo $tmp
                echo $tmp2
                echo $tmp
                echo $tmp2
                echo $tmp
                echo $tmp2
                echo $tmp
                echo $tmp2
                echo $tmp
                echo $tmp2
                echo $tmp
                echo $tmp2
                echo $tmp
                echo $tmp2
                echo $tmp
                echo $tmp2
            done

        Using this script I can reliably reproduce the bug (i.e. get an
        "Interrupted system call" error) with bash 4.1.2 (compiled
        myself from the vanilla tarball) and 3.1.17 (Debian lenny) over
        SSH, and with 4.0.35 (stock Fedora 12) under strace.

Fix:
        Applying the following simple patch (against 4.1.2) fixes the bug:

--- builtins/echo.def.orig	2010-03-24 19:40:54.000000000 +0100
+++ builtins/echo.def	2010-03-24 19:47:07.000000000 +0100
@@ -27,6 +27,7 @@
 
 #include "../bashansi.h"
 
+#include <signal.h>
 #include <stdio.h>
 #include "../shell.h"
 
@@ -108,6 +109,7 @@
 {
   int display_return, do_v9, i, len;
   char *temp, *s;
+  sigset_t nmask, omask;
 
   do_v9 = xpg_echo;
   display_return = 1;
@@ -159,6 +161,10 @@
 
   clearerr (stdout); /* clear error before writing and testing success */
 
+  sigemptyset(&nmask);
+  sigaddset(&nmask, SIGCHLD);
+  sigprocmask(SIG_BLOCK, &nmask, &omask);
+
   terminate_immediately++;
   while (list)
     {
@@ -193,6 +199,8 @@
   if (display_return)
     putchar ('\n');
 
+  sigprocmask(SIG_SETMASK, &omask, NULL);
+
   terminate_immediately--;
   return (sh_chkwrite (EXECUTION_SUCCESS));
 }
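
For readers who want to try the idea outside of bash, the block-and-restore pattern used by the patch can be sketched as a small standalone C helper. This is only an illustration under stated assumptions: puts_sigchld_blocked() and sigchld_is_blocked() are hypothetical names and not part of bash, and the sketch flushes the stream before restoring the mask so that the underlying write happens while SIGCHLD is still blocked. That flush is an assumption of the sketch and is not taken from the patch.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdio.h>

/* Hypothetical helper (not a bash function): report whether SIGCHLD is
 * currently blocked in the calling thread. Returns 1, 0, or -1 on error. */
int sigchld_is_blocked(void)
{
    sigset_t cur;

    /* With a NULL new set, sigprocmask only queries the current mask. */
    if (sigprocmask(SIG_BLOCK, NULL, &cur) != 0)
        return -1;
    return sigismember(&cur, SIGCHLD);
}

/* Hypothetical helper (not a bash function): write text plus a newline to
 * out while SIGCHLD is blocked, then restore the previous signal mask.
 * Returns 0 on success, -1 on any error. */
int puts_sigchld_blocked(const char *text, FILE *out)
{
    sigset_t nmask, omask;
    int rc = 0;

    /* Same three calls as in the patch: build a set holding only SIGCHLD
     * and add it to the blocked signals, saving the old mask. */
    sigemptyset(&nmask);
    sigaddset(&nmask, SIGCHLD);
    if (sigprocmask(SIG_BLOCK, &nmask, &omask) != 0)
        return -1;

    /* Assumption of this sketch: flush here so the actual write is done
     * before SIGCHLD can be delivered again. */
    if (fputs(text, out) == EOF || fputc('\n', out) == EOF ||
        fflush(out) == EOF)
        rc = -1;

    /* Restore the saved mask; a SIGCHLD that became pending while blocked
     * is delivered once it is unblocked. */
    if (sigprocmask(SIG_SETMASK, &omask, NULL) != 0)
        return -1;

    return rc;
}
```

A caller would replace a plain puts()-style write with puts_sigchld_blocked("var1=...", stdout). Restoring the saved mask with SIG_SETMASK, rather than unblocking SIGCHLD unconditionally, leaves the signal blocked if the caller already had it blocked.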