Re: Bug#429021: builtin echo command redirection misbehaves in detached scripts when terminal is closed

2007-09-10 Thread Stephane Chazelas
On Mon, Sep 10, 2007 at 12:05:57AM +0200, Aurelien Jarno wrote:
[...]
> >>> bash -c 'echo a; echo b > a' >&-
> >>>
> >>> is enough for me to reproduce the problem.

[both "a" and "b" seen in file "a".]

> >> Guess you have a buggy libc, then.
> > [...]
> > 
> > I wouldn't be surprised if it has to do with the fix to debian
> > bug #429021. http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=429021
> > (I'm CCing Dmitry who is the author of that change according to
> > bugs.debian.org)
> > 
> 
> I can reproduce the "bug" with glibc from etch, or even from sarge, so I
> really doubt that it comes from this change.
[...]

Hi Aurelien.

The reason I suspected that is that Andreas, with glibc 2.6.1,
was not seeing the problem, which suggested it could be a
Debian-specific issue. Also, Pierre-Philippe says it is not in
Debian unstable as of 1 May 2007 (glibc 2.5 based). And the only
diff on libio/fileops.c in glibc-2.6.1-2 is that fix for 429021,
and the log for that bug discusses something closely related.

I could not reproduce the problem with glibc 2.3.4 on an old
RedHat system. That version of glibc falls in between sarge's
(2.3.2) and etch's (2.3.6).

Andreas, could you please confirm which distribution of Linux
you have and which version of the libc package?

All in all, that would suggest that the change was introduced by
Debian, if not by the fix for 429021. To sum up:

The libc's fflush() seems to empty its buffer upon an unsuccessful
call (an fflush(3) where the write(2) fails) on
  - debian unstable glibc 2.5 (according to Pierre-Philippe)
  - Andreas' glibc 2.6.1
  - Some RedHat glibc 2.3.4 (according to me)
  - Solaris 7 system libc (not glibc)
  - HPUX 11.11 system libc (not glibc)

And it seems not to empty it in
  - debian unstable 2.6.1-2 (according to me and
  Pierre-Philippe)
  - debian etch (2.3.6?) according to Aurelien
  - debian sarge (2.3.2?) according to Aurelien

Best regards,
Stéphane




Re: builtin echo command redirection misbehaves in detached scripts when terminal is closed

2007-09-10 Thread Stephane Chazelas
On Mon, Sep 10, 2007 at 11:56:33AM +0400, Dmitry Potapov wrote:
> On Sun, Sep 09, 2007 at 10:18:07PM +0100, Stephane Chazelas wrote:
> > Now, I'm not sure if we can say that the new glibc behavior
> > observed is bogus (other than it's different from the behavior
> > observed in all the libcs I tried with).
> 
> What libc have you tried?
> 
> To me, the new behavior makes much more sense, as dropping buffer on
> error is really weird thing to do. I have looked at the source code of
> newlib and dietlibc, none of them drops buffer on error, and I am not
> aware about any other implementation of libc that does.

Hi Dmitry,

Thanks for replying. I gave a list in another email. I tried on
Solaris 7 and HPUX, and both seem to flush the buffer upon an
unsuccessful fflush().

> > It is not a harmless
> > change, for sure as it seems to have broken at least bash, zsh
> > and possibly ksh93.
> 
> Unfortunately, you are right. I did not foresee that some shells may use
> "dup2(open("file.txt"), fileno(stdout))". It is a dirty hack, which may
> cause some other problems. Frankly, I am a bit surprised that bash uses
> printf instead of write(2).  BTW, you cannot use 'printf' in signal
> handlers, so it seems that you cannot use 'echo' in trap commands too.
> 
> Perhaps, we should rollback my patch and give some time for developers
> to fix their broken shells, but, in this case, what is actually broken
> are those shells, not libc!
[...]

On the other hand, how would you force the flush of the buffer?

And how would you redirect stdout? We can use freopen() instead
of the hack above for files, but not for pipes or arbitrary fds
(as in >&3). Eric Blake was suggesting to use freopen(NULL) (not
to fix that very problem but because of the fact that if you
reassign stdout to some resource of a different nature, you need
to tell stdio as stdio may need to operate differently), but
that's not very portable according to POSIX. Would freopen(NULL)
flush the output buffer?

You cannot simply assign stdout to some value returned by
fdopen() as that's not portable either...
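
For the plain-file case, the freopen() route mentioned above might look
like the sketch below. The helper name is invented for this example and
it takes any stream, so it can be tried without touching the real
stdout. It needs a pathname, which is exactly why it cannot express
>&3 or a pipe.

```c
#include <stdio.h>

/* Re-associate `stream` with the file at `path`, truncating the file.
   Returns 0 on success, -1 on failure.
   (Hypothetical helper for illustration only.) */
int redirect_stream_to_file(FILE *stream, const char *path)
{
    /* POSIX: freopen() first tries to flush and close the old file, and
       failures of that flush or close are ignored. If freopen() itself
       fails, the stream is closed and must not be used again. */
    return freopen(path, "w", stream) != NULL ? 0 : -1;
}
```

With this, something like redirect_stream_to_file(stdout, "a") would
cover the `echo b > a` case only.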

-- 
Stéphane




Re: builtin echo command redirection misbehaves in detached scripts when terminal is closed

2007-09-10 Thread Stephane Chazelas
On Mon, Sep 10, 2007 at 02:17:41PM +0400, Dmitry Potapov wrote:
[...]
> On Mon, Sep 10, 2007 at 09:08:33AM +0100, Stephane Chazelas wrote:
> > thanks for replying, I gave a list in another email. I tried on
> > Solaris 7 and HPUX and both seem to flush the buffer upon an
> > unsuccessful fflush()
> 
> I see... I wonder how they work in regard of my original problem
> described in the Bug#429021, because it is possible to not discard data
> when write failed, but still clean buffer in fflush(). So, functions
> like fwrite, printf will not lose some previously written data on error,
> but fflush() will always have a clean output buffer at return, so
> it will not break existing software, which use dup2 trick.

I'll investigate this evening (BTW, it wasn't Solaris 7, but
Solaris 8).

> > On the other hand, how would you force the flush of the buffer?
> 
> The flush means to _deliver_ data, which is impossible in this case.

Sorry, I meant flush() as in emptying the buffer (whether by
flushing it to the fd or by sending it down the drain, i.e.
discarding it).

BTW, does anybody know why our emails don't seem to make it to
the bash mailing list anymore?

-- 
Stéphane




Re: builtin echo command redirection misbehaves in detached scripts when terminal is closed

2007-09-10 Thread Dmitry Potapov
On Sun, Sep 09, 2007 at 10:18:07PM +0100, Stephane Chazelas wrote:
> Now, I'm not sure if we can say that the new glibc behavior
> observed is bogus (other than it's different from the behavior
> observed in all the libcs I tried with).

Which libcs have you tried?

To me, the new behavior makes much more sense, as dropping the buffer on
error is a really weird thing to do. I have looked at the source code of
newlib and dietlibc; neither of them drops the buffer on error, and I am
not aware of any other implementation of libc that does.

> It is not a harmless
> change, for sure as it seems to have broken at least bash, zsh
> and possibly ksh93.

Unfortunately, you are right. I did not foresee that some shells may use
"dup2(open("file.txt"), fileno(stdout))". It is a dirty hack, which may
cause some other problems. Frankly, I am a bit surprised that bash uses
printf instead of write(2).  BTW, you cannot use 'printf' in signal
handlers, so it seems that you cannot use 'echo' in trap commands either.

Perhaps we should roll back my patch and give developers some time
to fix their broken shells but, in this case, what is actually broken
are those shells, not libc!

Regards,
Dmitry




Re: Bug#429021: builtin echo command redirection misbehaves in detached scripts when terminal is closed

2007-09-10 Thread Dmitry Potapov
On Mon, Sep 10, 2007 at 12:05:57AM +0200, Aurelien Jarno wrote:
> > I wouldn't be surprised if it has to do with the fix to debian
> > bug #429021. http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=429021
> > (I'm CCing Dmitry who is the author of that change according to
> > bugs.debian.org)
> > 
> 
> I can reproduce the "bug" with glibc from etch, or even from sarge, so I
> really doubt that it comes from this change.

I can NOT reproduce the problem with glibc from etch, and I do believe
that my patch caused the aforementioned problem, though I do not think
that the patch was incorrect, as the real bug lies inside those
shells.

Dmitry




Re: builtin echo command redirection misbehaves in detached scripts when terminal is closed

2007-09-10 Thread Dmitry Potapov
Hi Stephane,

On Mon, Sep 10, 2007 at 09:08:33AM +0100, Stephane Chazelas wrote:
> thanks for replying, I gave a list in another email. I tried on
> Solaris 7 and HPUX and both seem to flush the buffer upon an
> unsuccessful fflush()

I see... I wonder how they behave with regard to my original problem
described in Bug#429021, because it is possible not to discard data
when a write fails, but still clear the buffer in fflush(). That way,
functions like fwrite and printf will not lose previously written data
on error, but fflush() will always leave a clean output buffer on return,
so it will not break existing software that uses the dup2 trick.

> On the other hand, how would you force the flush of the buffer?

The flush means to _deliver_ data, which is impossible in this case.

> And how would you redirect stdout? We can use freopen() instead
> of the hack above for files, but not for pipes or arbitrary fds
> (as in >&3). 

I see... POSIX has fdopen to create a stream based on the existing
file descriptor, but there is no function to change an existing
stream like 'stdout'. So, I don't know any other portable solution
except avoiding 'stdout'. For some implementations, you can just
assign any FILE pointer to stdout like this:

FILE* out = fdopen(fd, mode);
if (out != NULL)
  {
    fclose(stdout);
    stdout = out;
  }
else
  report_error();

but in general it does not work, because stdout is not required to be
a modifiable lvalue.

> Eric Blake was suggesting to use freopen(NULL) (not
> to fix that very problem but because of the fact that if you
> reassign stdout to some resource of a different nature, you need
> to tell stdio as stdio may need to operate differently), but
> that's not very portable according to POSIX. Would freopen(NULL)
> flush the output buffer?

In Glibc, freopen:

  if (filename == NULL && _IO_fileno (fp) >= 0)
    {
      fd = __dup (_IO_fileno (fp));
      if (fd != -1)
        filename = fd_to_filename (fd);
    }

Then it closes the original stream and opens a new one in
the same place. So I believe it should work with glibc,
provided that you call it after dup2 and that your
system has /proc, because fd_to_filename relies on it.

freopen in newlib does not do anything special about NULL,
so I believe it does not work with NULL.

Perhaps, freopen("/dev/stdout") is a more portable way to
do what you want.

Regards,
Dmitry




Re: builtin echo command redirection misbehaves in detached scripts when terminal is closed

2007-09-10 Thread Chet Ramey
Dmitry Potapov wrote:

> 
> Unfortunately, you are right. I did not foresee that some shells may use
> "dup2(open("file.txt"), fileno(stdout))". It is a dirty hack, which may
> cause some other problems. Frankly, I am a bit surprised that bash uses
> printf instead of write(2).  BTW, you cannot use 'printf' in signal
> handlers, so it seems that you cannot use 'echo' in trap commands too.

Luckily, neither of these things is true.

What's needed is a portable interface like BSD's fpurge(3).

Chet

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
   Live Strong.  No day but today.
Chet Ramey, ITS, CWRU    [EMAIL PROTECTED]    http://cnswww.cns.cwru.edu/~chet/




Re: builtin echo command redirection misbehaves in detached scripts when terminal is closed

2007-09-10 Thread Dmitry Potapov
Hello Stephane,

I was wrong to suggest freopen("/dev/stdout") in my previous mail.
It cannot be used to redirect stdout.

Regards,
Dmitry




Re: builtin echo command redirection misbehaves in detached scripts when terminal is closed

2007-09-10 Thread Andreas Schwab
Chet Ramey <[EMAIL PROTECTED]> writes:

> What's needed is a portable interface like BSD's fpurge(3).

This is also available from glibc as __fpurge (likewise on Solaris).

Andreas.

-- 
Andreas Schwab, SuSE Labs, [EMAIL PROTECTED]
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."




Re: builtin echo command redirection misbehaves in detached scripts when terminal is closed

2007-09-10 Thread Eric Blake-1

> What's needed is a portable interface like BSD's fpurge(3).

Gnulib provides this[1].  Maybe you should consider using
gnulib to enhance the portability of future versions of bash.

[1] http://www.gnu.org/software/gnulib/MODULES.html#module=fpurge

-- 
Eric Blake





Re: builtin echo command redirection misbehaves in detached scripts when terminal is closed

2007-09-10 Thread Chet Ramey
Andreas Schwab wrote:
> Chet Ramey <[EMAIL PROTECTED]> writes:
> 
>> What's needed is a portable interface like BSD's fpurge(3).
> 
> This is also available from glibc as __fpurge (likewise on Solaris).

Yes, though I have an aversion to calling functions with a `__' prefix
from user application code.

However:

"These  functions  are  nonstandard  and  not  portable."

It would be nice to have something standardized.  I can certainly add
yet another configure test for this -- I just wish I didn't have to.
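
To make the trade-off concrete, here is a rough sketch of what such a
wrapper could look like. It is an assumption-laden illustration, not
code from bash or gnulib: HAVE___FPURGE stands in for the result of a
configure test, __has_include is only available on newer compilers
(GCC 5 or later, clang, C23), and the BSD branch is guessed from the
platform macros rather than tested. The function names are invented.

```c
#include <stdio.h>

/* Stand-in for a configure test: <stdio_ext.h> declares __fpurge()
   on glibc and Solaris. */
#if defined(__has_include)
#  if __has_include(<stdio_ext.h>)
#    include <stdio_ext.h>
#    define HAVE___FPURGE 1
#  endif
#endif

/* Discard whatever is buffered in `stream` without writing it.
   Returns 0 on success, -1 if no purge primitive is known here. */
int discard_buffered_output(FILE *stream)
{
#if defined(HAVE___FPURGE)
    __fpurge(stream);               /* declared as returning void */
    return 0;
#elif defined(__APPLE__) || defined(__FreeBSD__) || \
      defined(__NetBSD__) || defined(__OpenBSD__) || defined(__DragonFly__)
    return fpurge(stream) == 0 ? 0 : -1;   /* BSD spelling, in <stdio.h> */
#else
    (void)stream;
    return -1;
#endif
}

/* Flush `stream`; if the flush fails, drop the stale bytes so that they
   cannot leak into whatever the descriptor is pointed at next. */
int flush_or_discard(FILE *stream)
{
    if (fflush(stream) == 0)
        return 0;
    clearerr(stream);
    return discard_buffered_output(stream);
}
```

A shell would call something like flush_or_discard(stdout) right
before the dup2() that implements a redirection.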

Chet

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
   Live Strong.  No day but today.
Chet Ramey, ITS, CWRU    [EMAIL PROTECTED]    http://cnswww.cns.cwru.edu/~chet/




Re: builtin echo command redirection misbehaves in detached scripts when terminal is closed

2007-09-10 Thread Stephane Chazelas
On Mon, Sep 10, 2007 at 11:57:34AM -0400, Chet Ramey wrote:
> Andreas Schwab wrote:
> > Chet Ramey <[EMAIL PROTECTED]> writes:
> > 
> >> What's needed is a portable interface like BSD's fpurge(3).
> > 
> > This is also available from glibc as __fpurge (likewise on Solaris).
> 
> Yes, though I have an aversion to calling functions with a `__' prefix
> from user application code.
> 
> However:
> 
> "These  functions  are  nonstandard  and  not  portable."
> 
> It would be nice to have something standardized.  I can certainly add
> yet another configure test for this -- I just wish I didn't have to.
[...]

Note that zsh seems to have the same problem as bash here
(except that it uses fwrite + fputc instead of printf).

The problem I saw with ksh93 seems to be unrelated as ksh93
doesn't seem to be using stdio.

Dmitry, your t.c in the debian report gives:

On Solaris 8:

$ ./t
signal handler called, sig=2
error at num_bytes=15352
fputs: Interrupted system call
writer: num_bytes=8 num_lines=1
reader: num_bytes=74888 num_lines=9361
reader: number of missing bytes: 5112

On HPUX 11.11:

$ ./t
signal handler called, sig=2
error at num_bytes=16376
fputs: Interrupted system call
fclose: Interrupted system call
reader: num_bytes=71816 num_lines=8977
reader: number of missing bytes: 8184

So they don't seem to bother to retry sending the data either
if the first write() fails.

With dietlibc:

$ ./t
signal handler called, sig=2
writer: num_bytes=80008 num_lines=10001
writer: expected num_bytes=8 but was 80008
reader: num_bytes=80007 num_lines=1
reader: number of missing bytes: -7

And dietlibc behaves the same as glibc patched with your
(Dmitry's) change upon the fflush. That is, bash would misbehave
in the same way if linked against dietlibc.

I've also verified that if I revert your change and recompile
glibc, bash's (and zsh's) problem goes away, which confirms,
if need be, that it was that fix that introduced the
change in behavior.

-- 
Stéphane




Re: builtin echo command redirection misbehaves in detached scripts when terminal is closed

2007-09-10 Thread Dmitry Potapov
On Mon, Sep 10, 2007 at 05:39:09PM +0100, Stephane Chazelas wrote:
> Dmitry, your t.c in the debian report gives:
> 
> On Solaris 8:
[...]
> On HPUX 11.11:
[...]
>
> So they don't seem to care either to retry and send the data
> if the first write() fails.

Yes, it seems they purge all data in the IO buffer on error.

> With dietlibc:
> 
> $ ./t
> signal handler called, sig=2
> writer: num_bytes=80008 num_lines=10001
> writer: expected num_bytes=8 but was 80008
> reader: num_bytes=80007 num_lines=1
> reader: number of missing bytes: -7
> 
> And dietlibc behaves the same as glibc patched with your
> (Dmitry's) change upon the fflush.

No, glibc with my patch gives:

$ ./t 
signal handler called, sig=2
error at num_bytes=69632
fputs: Interrupted system call
writer: num_bytes=8 num_lines=1
reader: num_bytes=8 num_lines=1

-7 indicates an error in dietlibc. Somehow, dietlibc does not take into
account that write(2) may write only part of the data, which should not
be treated as an error.  But this bug in dietlibc is irrelevant to our
problem. Newlib should behave like glibc with my patch, but I have not
tested it.
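
For reference, handling a short write usually means looping until
everything has been written. The helper below is a generic sketch with
an invented name, not code from any of the libcs compared here; in
particular, restarting after EINTR is a policy choice of this example.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Write all `len` bytes of `buf` to `fd`, continuing after short writes
   and restarting when write(2) is interrupted before writing anything.
   Returns 0 on success, -1 with errno set on any other error. */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* nothing was written: try again */
            return -1;
        }
        p += n;                 /* a short write is progress, not an error */
        len -= (size_t)n;
    }
    return 0;
}
```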

Dmitry




Re: builtin echo command redirection misbehaves in detached scripts when terminal is closed

2007-09-10 Thread Stephane Chazelas
On Mon, Sep 10, 2007 at 09:25:26PM +0400, Dmitry Potapov wrote:
[...]
> > With dietlibc:
> > 
> > $ ./t
> > signal handler called, sig=2
> > writer: num_bytes=80008 num_lines=10001
> > writer: expected num_bytes=8 but was 80008
> > reader: num_bytes=80007 num_lines=1
> > reader: number of missing bytes: -7
> > 
> > And dietlibc behaves the same as glibc patched with your
> > (Dmitry's) change upon the fflush.
> 
> No, glibc with my patch gives:
[...]

Sorry for the misunderstanding, I meant "upon the fflush", as in
wrt the issue at stake, that is the fact that dietlibc doesn't
seem to empty the output buffer upon an unsuccessful fflush
either, which confirms what you suspected earlier through
reading the dietlibc code. I did not mean that "t" was behaving
the same in glibc and dietlibc. With the glibc, I obtain:

$ ~/t
signal handler called, sig=2
error at num_bytes=66560
fputs: Interrupted system call
reader: num_bytes=8 num_lines=1
writer: num_bytes=8 num_lines=1

And with your fix reverted:

.../glibc-2.6.1/build-tree/i386-libc$ LD_LIBRARY_PATH=$PWD ~/t
signal handler called, sig=2
error at num_bytes=66560
fputs: Interrupted system call
writer: num_bytes=8 num_lines=1
reader: num_bytes=78976 num_lines=9872
reader: number of missing bytes: 1024

as expected.

Best regards,
Stéphane