Test exit status misinterpreted in scripts when built without job control
Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64' -DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-unknown-linux-gnu' -DCONF_VENDOR='unknown' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash' -DSHELL -DHAVE_CONFIG_H -I. -I.. -I../include -I../lib -g -O2
uname output: Linux spitfire.my.domain 3.13.0-88-generic #135-Ubuntu SMP Wed Jun 8 21:10:42 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
Machine Type: x86_64-unknown-linux-gnu

Bash Version: 4.3
Patch Level: 30
Release Status: release

Description:
        When bash is built without job control, shell scripts that use
        the 'test' builtin (e.g., via '[') in conditionals may take the
        wrong branch because the exit status of the test is lost.

Repeat-By:
        Configure without job control, e.g. via:

        ./configure --prefix=/usr --bindir=/bin --without-bash-malloc --disable-nls --disable-job-control

        Invoke the resulting shell and run the following sequence of commands:

        $ cat > foo.sh
        if [ $# -lt 2 ]
        then
                echo "$# args is less than 2"
        else
                echo "$# args is not less than 2"
        fi
        $ chmod +x ./foo.sh
        $ ./foo.sh 1 2 3 4
        4 args is less than 2
        $

        Observe the output: '4' is not actually less than '2', yet the
        script incorrectly reports it as such.

        Note: we originally discovered this when porting 'bash' to a new
        research operating system that does not support job control.
        However, we were able to reproduce it on Linux.
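For comparison, here is a minimal sketch of what a correctly behaving shell should do with the same conditional. The function name `report_args` is made up for illustration and does not appear in the report; the body is the conditional from foo.sh. The exit status of `[ 4 -lt 2 ]` should be 1 (false), so the `else` branch is the one that should run.

```shell
#!/bin/sh
# Same conditional as foo.sh, wrapped in a function.
# (report_args is a hypothetical name, used only for this sketch.)
report_args() {
    if [ $# -lt 2 ]
    then
        echo "$# args is less than 2"
    else
        echo "$# args is not less than 2"
    fi
}

# Look at the raw exit status of the test itself:
# 0 means the comparison is true, 1 means it is false.
set -- 1 2 3 4
[ $# -lt 2 ]
status=$?
echo "exit status of [ $# -lt 2 ]: $status"

# With four arguments the else branch should be taken;
# with one argument the then branch should be taken.
report_args 1 2 3 4
report_args 1
```

In the failing build the equivalent of `status` appears to be lost, which is why the `then` branch runs even though the comparison is false.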
Re: Test exit status misinterpreted in scripts when built without job control
On 8/4/16 12:05 PM, Dan Cross wrote:

> Bash Version: 4.3
> Patch Level: 30
> Release Status: release
>
> Description:
>         When bash is built without job control, shell scripts that use
>         the 'test' builtin (e.g., via '[') in conditionals may take the
>         wrong branch because the exit status of the test is lost.
>
> Repeat-By:
>         Configure without job control. Via e.g.,
>         ./configure --prefix=/usr --bindir=/bin --without-bash-malloc --disable-nls --disable-job-control
>         Invoke the resulting shell and run the following sequence of commands:
>
>         $ cat > foo.sh
>         if [ $# -lt 2 ]
>         then
>                 echo "$# args is less than 2"
>         else
>                 echo "$# args is not less than 2"
>         fi
>         $ chmod +x ./foo.sh
>         $ ./foo.sh 1 2 3 4
>         4 args is less than 2
>         $
>
>         Observe the output: '4' is not actually less than '2' yet the
>         script incorrectly reports it as such.

Thanks for the report. I took a quick look at this, and it's not disabling
job control that does it: it's disabling both job control and nls.
Disabling either one while leaving the other enabled doesn't produce this
error (which only happens in the case where you run a script with the
execute bit set without a #! line after running an executable that causes
the shell to call waitpid()). It's a strange set of circumstances.

I'll see what I can find.

Chet

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
                 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://cnswww.cns.cwru.edu/~chet/
Re: Test exit status misinterpreted in scripts when built without job control
On Thu, Aug 4, 2016 at 2:36 PM, Chet Ramey wrote:

> On 8/4/16 12:05 PM, Dan Cross wrote:
>
> > Bash Version: 4.3
> > Patch Level: 30
> > Release Status: release
> >
> > Description:
> >         When bash is built without job control, shell scripts that use
> >         the 'test' builtin (e.g., via '[') in conditionals may take the
> >         wrong branch because the exit status of the test is lost.
> >
> > Repeat-By:
> >         Configure without job control. Via e.g.,
> >         ./configure --prefix=/usr --bindir=/bin --without-bash-malloc --disable-nls --disable-job-control
> >         Invoke the resulting shell and run the following sequence of commands:
> >
> >         $ cat > foo.sh
> >         if [ $# -lt 2 ]
> >         then
> >                 echo "$# args is less than 2"
> >         else
> >                 echo "$# args is not less than 2"
> >         fi
> >         $ chmod +x ./foo.sh
> >         $ ./foo.sh 1 2 3 4
> >         4 args is less than 2
> >         $
> >
> >         Observe the output: '4' is not actually less than '2' yet the
> >         script incorrectly reports it as such.
>
> Thanks for the report. I took a quick look at this, and it's not disabling
> job control that does it: it's disabling both job control and nls.
> Disabling either one while leaving the other enabled doesn't produce this
> error (which only happens in the case where you run a script with the
> execute bit set without a #! line after running an executable that causes
> the shell to call waitpid()). It's a strange set of circumstances.
>
> I'll see what I can find.

Thanks, Chet. FYI, I tried building for the research kernel with NLS
enabled and am still seeing the problem. Our patch is pretty minimal: it
mostly just adds the name of the OS as supported in the various configure
scripts. We also have a requirement that strings written using 'echo' get
written with one system call, so I bypass stdio for that. Oh, and we have
another context string in addition to errno that we print on errors.
Also, I was able to reproduce on an unpatched bash on Linux with NLS enabled:

% ../configure --prefix=/usr --bindir=/bin --without-bash-malloc --disable-job-control
% grep NLS config.h
#define ENABLE_NLS 1
% make
(build output omitted for brevity)
% ./bash --noprofile --norc
$ ./foo.sh 1 2 3 4
4 args is not less than 2
$ ./foo.sh 1 2 3 4
4 args is less than 2
$ exit
%

Thanks again!

        - Dan C.

(PS: If you're curious, we're porting bash to the Akaros operating system: http://akaros.org/)
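The two-run session above could be turned into a rough, repeatable check along the following lines. This is only a sketch: the helper names `classify` and `check_shell` are hypothetical, and it assumes that running foo.sh twice through `-c` exercises the same code path as the interactive session, which the thread does not confirm. foo.sh is deliberately written without a #! line, since that is one of the conditions described for the failure.

```shell
#!/bin/sh
# classify (hypothetical helper): compare the combined output of two runs
# of "./foo.sh 1 2 3 4" against what a correct shell should print.
classify() {
    expected='4 args is not less than 2
4 args is not less than 2'
    if [ "$1" = "$expected" ]; then
        echo "consistent"
    else
        echo "mismatch"
    fi
}

# check_shell (hypothetical helper): create foo.sh in a scratch directory,
# run it twice under the shell given as $1, and classify the output.
# Assumes the scratch directory allows executing files.
check_shell() {
    shell_under_test=$1
    workdir=$(mktemp -d) || return 2
    # No #! line on purpose, matching the script in the report.
    cat > "$workdir/foo.sh" <<'EOF'
if [ $# -lt 2 ]
then
    echo "$# args is less than 2"
else
    echo "$# args is not less than 2"
fi
EOF
    chmod +x "$workdir/foo.sh"
    out=$(cd "$workdir" && "$shell_under_test" -c './foo.sh 1 2 3 4; ./foo.sh 1 2 3 4')
    rm -rf "$workdir"
    classify "$out"
}

# Only run the check when a shell path is passed on the command line.
if [ $# -gt 0 ]; then
    check_shell "$1"
fi
```

Saved under some name such as `repro.sh` (also hypothetical), it would be invoked as `sh repro.sh ./bash` from the build directory. A "mismatch" result would correspond to the second run in the session above taking the wrong branch.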