Re: Named fifo's causing hanging bash scripts
On 1/12/15 9:55 AM, wer...@linux-8jdz.site wrote:

> Bash Version: 4.3
> Patch Level: 33
> Release Status: release
>
> Description:
>         Named fifo's causing hanging bash scripts like
>
>         while IFS="|" read a b c ; do
>             [shell code]
>         done < <(shell code)
>
>         can cause random hangs of the bash. An strace shows that the bash
>         stays in wait4()

I can't reproduce this.  I spun up a VM running OpenSUSE 13 and ran the
attached script against a version of bash-4.3.33 that was modified to use
FIFOs instead of /dev/fd.  There were no hangs in any of about 30 runs.

Chet
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU    c...@case.edu    http://cnswww.cns.cwru.edu/~chet/
Re: [bug-bash] Named fifo's causing hanging bash scripts
On 1/13/15 4:29 AM, Dr. Werner Fink wrote:

>>> Bash Version: 4.3
>>> Patch Level: 33
>>> Release Status: release
>>>
>>> Description:
>>>         Named fifo's causing hanging bash scripts like
>>>
>>>         while IFS="|" read a b c ; do
>>>             [shell code]
>>>         done < <(shell code)
>>>
>>>         can cause random hangs of the bash. An strace shows that the bash
>>>         stays in wait4()
>>
>> And when you attach to one of the hanging bash processes using gdb, what
>> does the stack traceback look like?
>
> Yes (and sorry for the wrong email address as this was done on a clean
> virtual system)
>
> there are two hanging bash processes together with the find command:
>
>   werner 19062  0.8  0.0  11864  2868 ttyS0  S+  10:21  0:00 bash -x /tmp/brp-25-symlink
>   werner 19063  0.0  0.0  11860  1920 ttyS0  S+  10:21  0:00 bash -x /tmp/brp-25-symlink
>   werner 19064  0.2  0.0  16684  2516 ttyS0  S+  10:21  0:00 find . -type l -printf %p|%h|%l\n
>
> the gdb -p 19062 and gdb -p 19063 show
>
> (gdb) bt
> #0  0x7f530818a65c in waitpid () from /lib64/libc.so.6
> #1  0x0042b233 in waitchld (block=block@entry=1, wpid=19175) at jobs.c:3235
> #2  0x0042c6da in wait_for (pid=pid@entry=19175) at jobs.c:2496

What do ps and gdb tell you about pid 19175 (and the corresponding pid in
the call to waitchld in the other traceback)?  Running, terminated, reaped,
other?

Chet
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU    c...@case.edu    http://cnswww.cns.cwru.edu/~chet/
Re: Named fifo's causing hanging bash scripts
On Fri, Jan 16, 2015 at 09:09:25AM -0500, Chet Ramey wrote:
> On 1/12/15 9:55 AM, wer...@linux-8jdz.site wrote:
>
> > Bash Version: 4.3
> > Patch Level: 33
> > Release Status: release
> >
> > Description:
> >         Named fifo's causing hanging bash scripts like
> >
> >         while IFS="|" read a b c ; do
> >             [shell code]
> >         done < <(shell code)
> >
> >         can cause random hangs of the bash. An strace shows that the bash
> >         stays in wait4()
>
> I can't reproduce this.  I spun up a VM running OpenSUSE 13 and ran the
> attached script against a version of bash-4.3.33 that was modified to use
> FIFOs instead of /dev/fd.  There were no hangs in any of about 30 runs.

Hmmm ... what I see is

  werner 10920  0.0  0.0  11860  2876 pts/1  S+  15:59  0:00 bash /tmp/brp-25-symlink
  werner 10921  0.0  0.0  11856  1844 pts/1  S+  15:59  0:00 bash /tmp/brp-25-symlink
  werner 10922  0.0  0.0  16684  2476 pts/1  S+  15:59  0:00 find . -type l -printf %p|%h|%l\n

  d136:~ # ll /proc/10920/fd
  total 0
  lr-x------ 1 werner suse 64 Jan 16 15:59 0 -> pipe:[124428]
  lrwx------ 1 werner suse 64 Jan 16 15:59 1 -> /dev/pts/1
  lrwx------ 1 werner suse 64 Jan 16 15:59 10 -> /dev/pts/1
  lrwx------ 1 werner suse 64 Jan 16 15:59 2 -> /dev/pts/1
  lr-x------ 1 werner suse 64 Jan 16 15:59 255 -> /tmp/brp-25-symlink
  d136:~ # ll /proc/10921/fd
  total 0
  lrwx------ 1 werner suse 64 Jan 16 15:59 0 -> /dev/pts/1
  l-wx------ 1 werner suse 64 Jan 16 15:59 1 -> pipe:[124428]
  lrwx------ 1 werner suse 64 Jan 16 15:59 2 -> /dev/pts/1

... but in the build there is

  [  131s] checking for mkfifo... yes

  [  150s] execute_cmd.c: In function 'execute_command_internal':
  [  150s] execute_cmd.c:1034:12: warning: 'ofifo_list' may be used uninitialized in this function [-Wmaybe-uninitialized]
  [  150s]      free ((void *)ofifo_list);
  [  150s]            ^

and currently the bash43 is not usable for the OBS here.  Also my personal
chrootx script, which uses <() for fiddling with xauth, hangs until I press
Ctrl-C.

Werner

--
  "Having a smoking section in a restaurant is like having
  a peeing section in a swimming pool."
  -- Edward Burr
Re: Named fifo's causing hanging bash scripts
On 1/16/15 10:25 AM, Dr. Werner Fink wrote:
> On Fri, Jan 16, 2015 at 09:09:25AM -0500, Chet Ramey wrote:
>> On 1/12/15 9:55 AM, wer...@linux-8jdz.site wrote:
>>
>>> Bash Version: 4.3
>>> Patch Level: 33
>>> Release Status: release
>>>
>>> Description:
>>>         Named fifo's causing hanging bash scripts like
>>>
>>>         while IFS="|" read a b c ; do
>>>             [shell code]
>>>         done < <(shell code)
>>>
>>>         can cause random hangs of the bash. An strace shows that the bash
>>>         stays in wait4()
>>
>> I can't reproduce this.  I spun up a VM running OpenSUSE 13 and ran the
>> attached script against a version of bash-4.3.33 that was modified to use
>> FIFOs instead of /dev/fd.  There were no hangs in any of about 30 runs.
>
> Hmmm ... what I see is

OK, but if I can't reproduce it, I can't investigate it.

>
>   werner 10920  0.0  0.0  11860  2876 pts/1  S+  15:59  0:00 bash /tmp/brp-25-symlink
>   werner 10921  0.0  0.0  11856  1844 pts/1  S+  15:59  0:00 bash /tmp/brp-25-symlink
>   werner 10922  0.0  0.0  16684  2476 pts/1  S+  15:59  0:00 find . -type l -printf %p|%h|%l\n
>
>   d136:~ # ll /proc/10920/fd
>   total 0
>   lr-x------ 1 werner suse 64 Jan 16 15:59 0 -> pipe:[124428]
>   lrwx------ 1 werner suse 64 Jan 16 15:59 1 -> /dev/pts/1
>   lrwx------ 1 werner suse 64 Jan 16 15:59 10 -> /dev/pts/1
>   lrwx------ 1 werner suse 64 Jan 16 15:59 2 -> /dev/pts/1
>   lr-x------ 1 werner suse 64 Jan 16 15:59 255 -> /tmp/brp-25-symlink
>   d136:~ # ll /proc/10921/fd
>   total 0
>   lrwx------ 1 werner suse 64 Jan 16 15:59 0 -> /dev/pts/1
>   l-wx------ 1 werner suse 64 Jan 16 15:59 1 -> pipe:[124428]
>   lrwx------ 1 werner suse 64 Jan 16 15:59 2 -> /dev/pts/1
>
> ... but in the build there is
>
>   [  131s] checking for mkfifo... yes

Sure, it's there, but if /dev/fd exists bash will prefer it.  Since the
VM I'm testing on has /dev/fd I had to manually edit config.h to disable
it.

>
>   [  150s] execute_cmd.c: In function 'execute_command_internal':
>   [  150s] execute_cmd.c:1034:12: warning: 'ofifo_list' may be used uninitialized in this function [-Wmaybe-uninitialized]
>   [  150s]      free ((void *)ofifo_list);

This isn't a useful warning.  The `free' is only called if the saved_fifo
flag is set, and that's only set if ofifo_list is initialized.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU    c...@case.edu    http://cnswww.cns.cwru.edu/~chet/
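One way to see which mechanism a particular bash binary actually uses for process substitution is to print the path it substitutes for `<(...)`. The following is a minimal sketch, not something taken from the thread: the function name `procsub_backend` is made up for illustration, and the assumption is that a /dev/fd-style build yields a path under /dev/fd/ (or /proc/self/fd/), while a FIFO build yields some other path.

```bash
# Requires bash (process substitution is not available in plain sh).
# Report which mechanism this bash uses for <(...).
# "procsub_backend" is a hypothetical helper name used only in this sketch.
procsub_backend() {
    local path
    # echo receives the substituted path as an ordinary argument.
    path=$(echo <(true))
    case $path in
        /dev/fd/*|/proc/self/fd/*) echo "devfd $path" ;;  # file-descriptor based
        *)                         echo "fifo $path" ;;   # assumed to be a named pipe
    esac
}

procsub_backend
```

If this prints a `devfd` line for the bash under test, the FIFO code path is not being exercised at all on that machine, which matches the point above about having to edit config.h to test it.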
Re: [bug-bash] Named fifo's causing hanging bash scripts
On Fri, Jan 16, 2015 at 09:22:36AM -0500, Chet Ramey wrote:
> On 1/13/15 4:29 AM, Dr. Werner Fink wrote:
>
> >>> Bash Version: 4.3
> >>> Patch Level: 33
> >>> Release Status: release
> >>>
> >>> Description:
> >>>         Named fifo's causing hanging bash scripts like
> >>>
> >>>         while IFS="|" read a b c ; do
> >>>             [shell code]
> >>>         done < <(shell code)
> >>>
> >>>         can cause random hangs of the bash. An strace shows that the
> >>>         bash stays in wait4()
> >>
> >> And when you attach to one of the hanging bash processes using gdb, what
> >> does the stack traceback look like?
> >
> > Yes (and sorry for the wrong email address as this was done on a clean
> > virtual system)
> >
> > there are two hanging bash processes together with the find command:
> >
> >   werner 19062  0.8  0.0  11864  2868 ttyS0  S+  10:21  0:00 bash -x /tmp/brp-25-symlink
> >   werner 19063  0.0  0.0  11860  1920 ttyS0  S+  10:21  0:00 bash -x /tmp/brp-25-symlink
> >   werner 19064  0.2  0.0  16684  2516 ttyS0  S+  10:21  0:00 find . -type l -printf %p|%h|%l\n
> >
> > the gdb -p 19062 and gdb -p 19063 show
> >
> > (gdb) bt
> > #0  0x7f530818a65c in waitpid () from /lib64/libc.so.6
> > #1  0x0042b233 in waitchld (block=block@entry=1, wpid=19175) at jobs.c:3235
> > #2  0x0042c6da in wait_for (pid=pid@entry=19175) at jobs.c:2496
>
> What do ps and gdb tell you about pid 19175 (and the corresponding pid in
> the call to waitchld in the other traceback)?  Running, terminated, reaped,
> other?

  d136:~ # ps 10942
    PID TTY      STAT   TIME COMMAND
  d136:~ #

... the process does not exist anymore.  I guess that this could belong to
the sed commands of the script.  The other thread is showing

  d136: # ps 10922
    PID TTY      STAT   TIME COMMAND
  13177 pts/1    S+     0:00 find . -type l -printf %p|%h|%l\n

and the backtrace shows here

  0x7fccae8d4860 in __write_nocancel () from /lib64/libc.so.6
  #0  0x7fccae8d4860 in __write_nocancel () from /lib64/libc.so.6
  #1  0x7fccae86f6b3 in _IO_new_file_write () from /lib64/libc.so.6
  #2  0x7fccae86ed73 in new_do_write () from /lib64/libc.so.6
  #3  0x7fccae8704e5 in __GI__IO_do_write () from /lib64/libc.so.6
  #4  0x7fccae86fbe1 in __GI__IO_file_xsputn () from /lib64/libc.so.6
  #5  0x7fccae8416e0 in vfprintf () from /lib64/libc.so.6
  #6  0x7fccae8eec05 in __fprintf_chk () from /lib64/libc.so.6
  #7  0x004106d5 in ?? ()
  #8  0x0040a11b in ?? ()
  #9  0x0040afa9 in ?? ()
  #10 0x0040b0a6 in ?? ()
  #11 0x00409bfe in ?? ()
  #12 0x00409bfe in ?? ()
  #13 0x00404199 in ?? ()
  #14 0x00403911 in ?? ()
  #15 0x7fccae81cb05 in __libc_start_main () from /lib64/libc.so.6
  #16 0x004039dd in ?? ()

which IMHO could mean that the output of find is no longer being read(?)

>
> Chet

Werner
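The checks used above (does the pid still exist, what state is it in, where do its file descriptors point) can be collected into one helper. This is a rough sketch assuming a Linux /proc filesystem; `inspect_pid` is a name invented here, and the output is only what /proc itself reports.

```bash
# Summarise a possibly hung process using only /proc (Linux assumed).
# "inspect_pid" is a hypothetical helper name used only in this sketch.
inspect_pid() {
    local pid=$1
    if [ ! -d "/proc/$pid" ]; then
        # No /proc entry: the process has exited and has been reaped.
        echo "pid $pid: no such process"
        return 1
    fi
    # Name, scheduler state (e.g. "S (sleeping)", "Z (zombie)") and parent pid.
    grep -E '^(Name|State|PPid):' "/proc/$pid/status"
    # Where each open descriptor points; both ends of a pipe share one inode.
    ls -l "/proc/$pid/fd"
}
```

For example, `inspect_pid 10922` on the find process would show whether its stdout is still the `pipe:[...]` that the reading shell has open on fd 0. A writer blocked in write() on a pipe that nobody reads is consistent with the backtrace above, though it does not by itself say why the reader stopped.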
Re: [bug-bash] Named fifo's causing hanging bash scripts
On 1/16/15 10:32 AM, Dr. Werner Fink wrote:
> On Fri, Jan 16, 2015 at 09:22:36AM -0500, Chet Ramey wrote:
>> On 1/13/15 4:29 AM, Dr. Werner Fink wrote:
>>
>>>>> Bash Version: 4.3
>>>>> Patch Level: 33
>>>>> Release Status: release
>>>>>
>>>>> Description:
>>>>>         Named fifo's causing hanging bash scripts like
>>>>>
>>>>>         while IFS="|" read a b c ; do
>>>>>             [shell code]
>>>>>         done < <(shell code)
>>>>>
>>>>>         can cause random hangs of the bash. An strace shows that the
>>>>>         bash stays in wait4()
>>>>
>>>> And when you attach to one of the hanging bash processes using gdb, what
>>>> does the stack traceback look like?
>>>
>>> Yes (and sorry for the wrong email address as this was done on a clean
>>> virtual system)
>>>
>>> there are two hanging bash processes together with the find command:
>>>
>>>   werner 19062  0.8  0.0  11864  2868 ttyS0  S+  10:21  0:00 bash -x /tmp/brp-25-symlink
>>>   werner 19063  0.0  0.0  11860  1920 ttyS0  S+  10:21  0:00 bash -x /tmp/brp-25-symlink
>>>   werner 19064  0.2  0.0  16684  2516 ttyS0  S+  10:21  0:00 find . -type l -printf %p|%h|%l\n
>>>
>>> the gdb -p 19062 and gdb -p 19063 show
>>>
>>> (gdb) bt
>>> #0  0x7f530818a65c in waitpid () from /lib64/libc.so.6
>>> #1  0x0042b233 in waitchld (block=block@entry=1, wpid=19175) at jobs.c:3235
>>> #2  0x0042c6da in wait_for (pid=pid@entry=19175) at jobs.c:2496
>>
>> What do ps and gdb tell you about pid 19175 (and the corresponding pid in
>> the call to waitchld in the other traceback)?  Running, terminated, reaped,
>> other?
>
>   d136:~ # ps 10942
>     PID TTY      STAT   TIME COMMAND
>   d136:~ #
>
> ... the process does not exist anymore.  I guess that this could belong to
> the sed commands of the script.

This is why I need to be able to reproduce it.  If the process got reaped,
when would it have happened and why would the call to wait_for() have
found a valid CHILD struct for it?  The whole loop runs with SIGCHLD
blocked, so it's not as if the signal handler could have reaped the
child out from under it.

I have questions but no way to find answers.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU    c...@case.edu    http://cnswww.cns.cwru.edu/~chet/
Re: [bug-bash] Named fifo's causing hanging bash scripts
On Fri, Jan 16, 2015 at 10:46:02AM -0500, Chet Ramey wrote:
> >>
> >> What do ps and gdb tell you about pid 19175 (and the corresponding pid in
> >> the call to waitchld in the other traceback)?  Running, terminated, reaped,
> >> other?
> >
> >   d136:~ # ps 10942
> >     PID TTY      STAT   TIME COMMAND
> >   d136:~ #
> >
> > ... the process does not exist anymore.  I guess that this could belong to
> > the sed commands of the script.
>
> This is why I need to be able to reproduce it.  If the process got reaped,
> when would it have happened and why would the call to wait_for() have
> found a valid CHILD struct for it?  The whole loop runs with SIGCHLD
> blocked, so it's not as if the signal handler could have reaped the
> child out from under it.  I have questions but no way to find answers.

OK, thanks for your effort ... I've stripped the spec file down step by step
and reached success at commenting out

  -DMUST_UNBLOCK_CHLD=1

(mea culpa) ... many thanks for your help!

Werner
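Since the hang here came from a compiler define added in the package build rather than from bash's defaults, it can help to list every `-D` flag a build file injects before reporting a bug. The sketch below is only an illustration: `list_defines` is an invented helper, and the file name `bash.spec` in the usage note is an assumption, not a path from the thread.

```bash
# Print each -DNAME or -DNAME=value token found in a build file,
# prefixed with the line number it appears on.
# "list_defines" is a hypothetical helper name used only in this sketch.
list_defines() {
    grep -n -o -E -- '-D[A-Za-z_][A-Za-z0-9_]*(=[^[:space:]"]*)?' "$1"
}
```

Running something like `list_defines bash.spec | grep MUST_UNBLOCK_CHLD` (file name assumed) shows quickly whether that define is still being passed to the compiler.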
Re: [bug-bash] Named fifo's causing hanging bash scripts
Dr. Fink,

Have you tried getting rid of the stderr redirect on your find command to
make sure find isn't showing any errors?

If you eliminate most of the inside of your while loop, does it still
hang?  For example:

    while IFS="|" read link link_dir link_dest; do
        echo "$link,$link_dir,$link_dest"
    done < <(find . -type l -printf '%p|%h|%l\n' 2>/dev/null)

-Jonathan Hankins

--
Jonathan Hankins    Homewood City Schools

The simplest thought, like the concept of the number one,
has an elaborate logical underpinning. - Carl Sagan

jhank...@homewood.k12.al.us
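Because the hang was reported as random, a reduced loop like the one above is easier to judge when it is run many times under a time limit. The following is a sketch under stated assumptions: GNU find (for -printf) and coreutils timeout(1) are available, the current directory contains some symlinks, and the helper names `reduced_loop` and `count_hangs` as well as the optional third argument (the bash binary to test) are inventions of this example. The stderr redirect is left off so that any errors from find stay visible.

```bash
# The reduced loop, reading find's output through process substitution.
# "reduced_loop" and "count_hangs" are hypothetical names used only here.
reduced_loop() {
    while IFS="|" read -r link link_dir link_dest; do
        echo "$link,$link_dir,$link_dest"
    done < <(find . -type l -printf '%p|%h|%l\n')
}

# Run the loop $1 times in a fresh shell, allowing $2 seconds per run.
# timeout(1) exits with status 124 when the limit is hit; count those runs.
# $3 optionally names the bash binary under test (defaults to "bash").
count_hangs() {
    local runs=$1 limit=$2 bash_bin=${3:-bash} hangs=0 i status
    for ((i = 0; i < runs; i++)); do
        timeout "$limit" "$bash_bin" -c "$(declare -f reduced_loop); reduced_loop" >/dev/null
        status=$?
        if [ "$status" -eq 124 ]; then
            hangs=$((hangs + 1))
        fi
    done
    echo "$hangs"
}
```

For instance, `count_hangs 30 10 /path/to/test/bash` (path assumed) would repeat the loop 30 times and print how many runs had to be killed after 10 seconds. A result of 0 does not prove the absence of a race, but a nonzero count gives a concrete reproduction rate to report.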