Re: bash randomly logs out on VT / serial console

2017-01-02 Thread Russell King
On Sun, Jan 01, 2017 at 09:14:45PM -0500, Chet Ramey wrote:
> On 1/1/17 4:01 PM, r...@armlinux.org.uk wrote:
> > Bash Version: 4.3
> > Patch Level: 30
> > Release Status: release
> > 
> > Description:
> > Running:
> > while :; do s=$(sleep .005 | cat); echo $s; done | uniq
> > in a login shell on a virtual terminal or serial console results
> > in the shell randomly logging out after ^C.
> 
> I believe this is fixed in bash-4.4.

Thank you for your reply.

However, while testing bash-4.4 with the five additional patches (bringing
it to 4.4.5) shows that it is not fixed in bash-4.4 - see below.  The line
from __tty_check_change() is from the debug I added to the kernel, which
shows that the reason for the EIO error is because the tty pgrp doesn't
match the process' pgrp.

# ps -ef | grep ttyS0
root 29257 1  0 10:44 ttyS000:00:00 /bin/login --
root 29293 29257  0 10:45 ttyS000:00:00 -bash

root@mcbin:~# while :; do s=$(sleep .005 | cat); echo $s; done | uniq

^C
root@mcbin:~# while :; do s=$(sleep .005 | cat); echo $s; done | uniq

^C
root@mcbin:~# __tty_check_change: tty=ffc0e74bd800 current 
pgrp=29293(ffc0e795fc00) tty pgrp=29257(ffc0e795f080)
logout

-- 
Russell King



Re: [Help-bash] Bash Trap: How to Get Line Number of a Subprocess with Non-Zero Status

2017-01-02 Thread Steve Amerige

  
  

On 12/30/2016 12:50 PM, Chet Ramey
  wrote:


  It's the difference between a simple command, which has a line number, and
a subshell command, which does not.  The subshell command's line number is
derived from the execution context, which is relative to the start of the
function.  The OP wants the subshell's line number to be absolute, like the
simple command's, and not relative to the start of the function, as it is
now.


From an end-user's perspective, it is unexpected that the LINENO has
the value of the
beginning of the function instead of the line at which the ERR was
caught by trap.
While the "inside" of the subshell might have it's own
characteristics, the inside must
eventually meet the "outside" of the subshell... at which point, an
end user can reasonably
expect that an error is reported in a common-sense way with the line
number that is
seen in the script file itself. While the current behavior can be
explained, that doesn't
mean it is expected by end users or makes sense.

Using any other line number for LINENO also makes it harder to debug
scripts, of course,
since the line number being reported isn't as helpful.

Is there any way this can be fixed... or introduced as a feature
(e.g., with some
set -o or shopt flag)?  A fix would help out a significant number of
people. As it is, a
workaround now is to use:

    $(...)

instead of:

    (...)

to get the expected behavior. But, this can mean some significant
re-writing of code.

Enjoy,
Steve Amerige
Server Science Incorporated
Eggsh:
A Powerful and Reusable Bash Scripting Platform, an Open-Source
Project


  




Re: [Help-bash] Bash Trap: How to Get Line Number of a Subprocess with Non-Zero Status

2017-01-02 Thread Chet Ramey
On 1/2/17 4:59 AM, Steve Amerige wrote:

> From an end-user's perspective, it is unexpected that the LINENO has the
> value of the
> beginning of the function instead of the line at which the ERR was caught
> by trap.

Sure, it's missing functionality.  Eduardo's patch was pretty much spot-on.
It will be fixed in the next version of bash, at least.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRUc...@case.eduhttp://cnswww.cns.cwru.edu/~chet/



Re: bash randomly logs out on VT / serial console

2017-01-02 Thread Chet Ramey
On 1/2/17 6:09 AM, Russell King wrote:
> On Sun, Jan 01, 2017 at 09:14:45PM -0500, Chet Ramey wrote:
>> On 1/1/17 4:01 PM, r...@armlinux.org.uk wrote:
>>> Bash Version: 4.3
>>> Patch Level: 30
>>> Release Status: release
>>>
>>> Description:
>>> Running:
>>> while :; do s=$(sleep .005 | cat); echo $s; done | uniq
>>> in a login shell on a virtual terminal or serial console results
>>> in the shell randomly logging out after ^C.
>>
>> I believe this is fixed in bash-4.4.
> 
> Thank you for your reply.
> 
> However, while testing bash-4.4 with the five additional patches (bringing
> it to 4.4.5) shows that it is not fixed in bash-4.4 - see below.  The line
> from __tty_check_change() is from the debug I added to the kernel, which
> shows that the reason for the EIO error is because the tty pgrp doesn't
> match the process' pgrp.

OK.  I can't reproduce it on Fedora 25.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRUc...@case.eduhttp://cnswww.cns.cwru.edu/~chet/



Re: bash randomly logs out on VT / serial console

2017-01-02 Thread Russell King
On Mon, Jan 02, 2017 at 02:06:01PM -0500, Chet Ramey wrote:
> On 1/2/17 6:09 AM, Russell King wrote:
> > On Sun, Jan 01, 2017 at 09:14:45PM -0500, Chet Ramey wrote:
> >> On 1/1/17 4:01 PM, r...@armlinux.org.uk wrote:
> >>> Bash Version: 4.3
> >>> Patch Level: 30
> >>> Release Status: release
> >>>
> >>> Description:
> >>>   Running:
> >>>   while :; do s=$(sleep .005 | cat); echo $s; done | uniq
> >>>   in a login shell on a virtual terminal or serial console results
> >>>   in the shell randomly logging out after ^C.
> >>
> >> I believe this is fixed in bash-4.4.
> > 
> > Thank you for your reply.
> > 
> > However, while testing bash-4.4 with the five additional patches (bringing
> > it to 4.4.5) shows that it is not fixed in bash-4.4 - see below.  The line
> > from __tty_check_change() is from the debug I added to the kernel, which
> > shows that the reason for the EIO error is because the tty pgrp doesn't
> > match the process' pgrp.
> 
> OK.  I can't reproduce it on Fedora 25.

Thank you for your reply.

Here's more information from the strace logs in case it helps - these
are from the "useful" command rather than the one which I quoted in
the bug report which reproduces it as well - "mii-diag" for the "sleep"
and "sed" for the "cat" part - otherwise identical.  Note that I'm
running a modified mii-diag here...


PID 10928 - the top level login shell:
=

17:38:38.716472 pipe2([3, 4], 0)= 0
17:38:38.716555 rt_sigprocmask(SIG_BLOCK, [INT CHLD], [CHLD], 8) = 0
17:38:38.716617 pipe2([5, 6], 0)= 0
17:38:38.716677 clone(child_stack=0, 
flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, 
child_tidptr=0x7fae66c0d0) = 14119
17:38:38.717076 setpgid(14119, 14119)   = 0
17:38:38.717185 rt_sigprocmask(SIG_SETMASK, [CHLD], NULL, 8) = 0
17:38:38.717249 close(4)= 0
17:38:38.717301 close(4)= -1 EBADF (Bad file descriptor)
17:38:38.717398 rt_sigprocmask(SIG_BLOCK, [INT CHLD], [CHLD], 8) = 0
17:38:38.717463 clone(child_stack=0, 
flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, 
child_tidptr=0x7fae66c0d0) = 14120
17:38:38.717936 setpgid(14120, 14119)   = 0
17:38:38.719654 rt_sigprocmask(SIG_SETMASK, [CHLD], NULL, 8) = 0
17:38:38.719764 close(3)= 0
17:38:38.722508 rt_sigprocmask(SIG_BLOCK, [CHLD], [CHLD], 8) = 0
17:38:38.722576 close(5)= 0
17:38:38.722626 close(6)= 0
17:38:38.745323 ioctl(255, TIOCGPGRP, [14119]) = 0
17:38:38.745381 rt_sigprocmask(SIG_SETMASK, [CHLD], NULL, 8) = 0
17:38:38.745501 rt_sigprocmask(SIG_BLOCK, [CHLD], [CHLD], 8) = 0
17:38:38.745560 wait4(-1, [{WIFSIGNALED(s) && WTERMSIG(s) == SIGINT}], 
WSTOPPED|WCONTINUED, NULL) = 14120
17:38:39.477455 wait4(-1, [{WIFSIGNALED(s) && WTERMSIG(s) == SIGINT}], 
WSTOPPED|WCONTINUED, NULL) = 14119

Here, the two sub-shells (pid 14119 and 14120) that bash has
spawned were terminated by SIGINT.

17:38:39.477971 rt_sigprocmask(SIG_BLOCK, [CHLD TSTP TTIN TTOU], [CHLD], 8) = 0
17:38:39.478119 ioctl(255, SNDRV_TIMER_IOCTL_SELECT or TIOCSPGRP, [10928]) = 0
17:38:39.478250 rt_sigprocmask(SIG_SETMASK, [CHLD], NULL, 8) = 0
17:38:39.478395 ioctl(255, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE 
or TCGETS, {B115200 opost isig icanon echo ...}) = 0
17:38:39.478626 ioctl(255, SNDCTL_TMR_STOP or SNDRV_TIMER_IOCTL_GINFO or 
TCSETSW, {B115200 opost isig icanon echo ...}) = 0
17:38:39.478782 ioctl(255, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE 
or TCGETS, {B115200 opost isig icanon echo ...}) = 0
17:38:39.479323 ioctl(255, TIOCGWINSZ, {ws_row=25, ws_col=80, ws_xpixel=0, 
ws_ypixel=0}) = 0
17:38:39.479488 write(1, "\n", 1)   = 1
17:38:39.479583 rt_sigprocmask(SIG_SETMASK, [CHLD], NULL, 8) = 0
17:38:39.479653 close(3)= -1 EBADF (Bad file descriptor)
17:38:39.479711 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
17:38:39.479802 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_KILLED, 
si_pid=14120, si_status=SIGINT, si_utime=0, si_stime=0} ---
17:38:39.479857 wait4(-1, 0x7fc81f8300, WNOHANG|WSTOPPED|WCONTINUED, NULL) = -1 
ECHILD (No child processes)
17:38:39.479921 rt_sigreturn()  = 0
17:38:39.480088 rt_sigaction(SIGINT, {0x45e080, [], 0}, {0x45e080, [], 0}, 8) = 0
17:38:39.480311 rt_sigprocmask(SIG_BLOCK, [CHLD TSTP TTIN TTOU], [], 8) = 0
17:38:39.480458 ioctl(255, SNDRV_TIMER_IOCTL_SELECT or TIOCSPGRP, [10928]) = 0

Here, the top level shell sets the TTY PGRP to 10928.  Note
the time stamp - 17:38:39.480458.

17:38:39.480614 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
17:38:39.480740 rt_sigaction(SIGINT, {0x45e080, [], 0}, {0x45e080, [], 0}, 8) = 0
17:38:39.480941 ioctl(0, TIOCGWINSZ, {ws_row=25, ws_col=80, ws_xpixel=0, 
ws_ypixel=0}) = 0
17:38:39.481081 ioctl(0, SNDRV_TIMER_IOCTL_STATUS or TIOCSWINSZ, {ws_row=25, 
ws_col=80, ws_xpixel=0, ws_ypixel=0}) = 0
17:38:39.481198 ioctl(0, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE 
or TCGETS, {B115200 opost isig icanon echo ...}) = 0
17:38: