On Fri, 14 Mar 2008, Marc Lehmann wrote:

to stdout (because the terminal is suspended) it interrupts the syscall.

Hmm, maybe you should read the manpage for write. Sending XOFF to a terminal
will certainly not cause any EINTR whatsoever. Writing to a tty when it is
stopped will either cause the program to block or to get EAGAIN, regardless
of any buffering.

I did read the manpage for write, and it only said this:

        EINTR  The call was interrupted by a signal before any data was written.

which did not help me a lot.


EINTR simply means a signal was caught during a syscall and causes the
syscall to abort with EINTR.

Thanks for telling me that now :)


This is a classical race condition, and the bug is simply that dstat
doesn't handle EINTR correctly.

No, it didn't, because this should not have happened during a write. In fact, it apparently only happens when the write takes too long.


(In fact, using alarm for timing is rather weird, and certainly will cause
a lot of race conditions like these. The fix is to use a more sensible way
to do timings (which would increase timer stability as well), and/or to
correctly handle EINTR from write; after all, this is Unix).

Feel free to offer me a working "better" implementation, as you are not giving me a clue how to do better timing. I am not a programmer, and the program works fine for me.


Or maybe I am confused because you used the word "suspended", but in the
examples I gave, nothing was suspended, the tty was simply stopped, but my
recipe to recreate the problem seems straightforward.

Suspended, stopped, whatever the term is. You obviously know the terminology that I lack.


How would any other program handle this? How would a real programmer
handle it? :-)

I quickly did a test with buffering enabled and an explicit flush, and it
can withstand a terminal suspend for longer (until the buffer gets full).
After that I again get an "Interrupted system call". I am wondering whether
we can increase the standard buffer, whether that is wise to do, and whether
this is the right approach.

Why not simply fix it the right way and not exit on XOFF? It works
with cat, it works with vi, it works with vmstat, it works with top, why
shouldn't it work with dstat the same way? It would only fulfill user
expectations caused by similar experience with other programs.

Marc, I do not understand what you are saying here. The write fails because output is unbuffered and the tty is "stopped". So if I ignored the exception (EINTR) (that is what you are saying, right?), the data would simply NOT be written.

So the correct way is to make the output buffered, and ignore the EINTR when the flush() fails. At least then the write() works fine.
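
For reference, an alternative to ignoring the error is to retry the interrupted write, so that no output is lost. The following is only a minimal sketch, not dstat code: the helper name write_all is hypothetical, and it assumes output goes through a raw file descriptor. Note that Python 3.5 and later already retry interrupted system calls on their own unless the signal handler raises an exception, so the explicit loop mainly matters on older interpreters.

```python
import errno
import os


def write_all(fd, data):
    """Write all of `data` (bytes) to `fd`, retrying after EINTR.

    Hypothetical helper for illustration; returns the number of bytes written.
    """
    view = memoryview(data)
    written = 0
    while written < len(view):
        try:
            # os.write may write only part of the buffer, so keep track
            # of how far we got and continue from there.
            written += os.write(fd, view[written:])
        except OSError as exc:
            if exc.errno != errno.EINTR:
                raise
            # A signal (e.g. SIGALRM) interrupted the call before any
            # data was written: simply try again.
    return written


if __name__ == "__main__":
    write_all(1, b"hello from write_all\n")  # fd 1 is stdout
```

With a stopped tty the retried write simply blocks until output is resumed, which matches the behaviour described above for cat and vmstat.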

Now that I know the EINTR is caused by SIGALRM, I can see how I could avoid SIGALRM, but ignorant as I am, I don't know how else to do better timing (maybe select or some other complicated way).
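
For reference, one signal-free way to do the timing is to sleep until an absolute deadline on each iteration, so that no alarm can interrupt a write and the interval does not drift. This is only a sketch under assumptions: the function name run_every and its parameters are hypothetical and not taken from dstat, and it assumes Python 3, where time.monotonic is available.

```python
import time


def run_every(interval, count, callback,
              clock=time.monotonic, sleep=time.sleep):
    """Call callback(i) `count` times, once per `interval` seconds.

    Hypothetical helper for illustration. Deadlines are absolute, so
    time spent inside the callback does not accumulate as drift.
    """
    next_tick = clock()
    for i in range(count):
        callback(i)
        next_tick += interval
        delay = next_tick - clock()
        if delay > 0:
            # Sleep only for the time remaining until the next deadline.
            sleep(delay)
        # If the callback overran the interval, continue without sleeping.


if __name__ == "__main__":
    run_every(1.0, 3, lambda i: print("sample", i))
```

Because nothing here relies on SIGALRM, a slow or blocked write only delays the next sample instead of aborting the program.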


Fixing the bugs in dstat is much better than coding some weird workaround
to reduce the chance of a race aborting it, but dstat stays buggy as long
as it doesn't correctly handle SIGALRM.

Feel free to send in a patch as you know more about it than me. Or implement your own dstat if you like, and I will happily use your dstat if it works better.

For me, dstat was a tool that I needed; if something like it had existed already, I wouldn't have written it. In fact, if someone else wants to maintain it, fork it, or reimplement it, and it does what I need, I will happily let them spend their time on it so I can do other things.

Thank you very much,
--
--   dag wieers,  [EMAIL PROTECTED],  http://dag.wieers.com/   --
[Any errors in spelling, tact or fact are transmission errors]


