On Sun, May 8, 2011 at 16:42, N.M. Maclaren <n...@cam.ac.uk> wrote:
> On May 8 2011, Janne Blomqvist wrote:
>>>
>>> the error printing functionality (in io/unix.c) st_printf and
>>> st_vprintf are not thread-safe as they use a static buffer. ...
>>
>> While this patch makes error printing thread-safe, it's no longer
>> async-signal-safe as the stderr lock might lead to a deadlock. So I'm
>> retracting this patch and thinking some more about this problem.
>
> It's theoretically insoluble, given the constraints you are working
> under.  Sorry.  It is possible to do reasonably well, but there will
> always be likely scenarios where all you can do is to say "Aargh!
> I give up."

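To make the trade-off concrete, here is a rough sketch of error printing that needs neither a static buffer nor the stderr lock: the message is built in a stack buffer and emitted with write(2), which is on the POSIX list of async-signal-safe functions. This is not the libgfortran code; buf_puts, buf_putl, write_all and report_error are hypothetical names, and the hand-rolled formatting is there only because the printf family is not guaranteed to be async-signal-safe.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: append string S to BUF at offset POS, never
   going past CAP - 1.  Returns the new offset.  */
static size_t
buf_puts (char *buf, size_t cap, size_t pos, const char *s)
{
  while (*s != '\0' && pos + 1 < cap)
    buf[pos++] = *s++;
  return pos;
}

/* Hypothetical helper: append the decimal form of V.  */
static size_t
buf_putl (char *buf, size_t cap, size_t pos, long v)
{
  char tmp[24];
  size_t n = 0;
  /* Take the magnitude in unsigned arithmetic so LONG_MIN is safe.  */
  unsigned long u = v < 0 ? 0UL - (unsigned long) v : (unsigned long) v;

  do
    {
      tmp[n++] = (char) ('0' + u % 10);
      u /= 10;
    }
  while (u != 0);
  if (v < 0)
    tmp[n++] = '-';
  /* Digits were produced in reverse order; copy them back to front.  */
  while (n > 0 && pos + 1 < cap)
    buf[pos++] = tmp[--n];
  return pos;
}

/* Write LEN bytes to FD, retrying on EINTR and on short writes.
   Returns 0 on success, -1 on error.  */
static int
write_all (int fd, const char *buf, size_t len)
{
  while (len > 0)
    {
      ssize_t w = write (fd, buf, len);
      if (w < 0)
        {
          if (errno == EINTR)
            continue;
          return -1;
        }
      buf += w;
      len -= (size_t) w;
    }
  return 0;
}

/* Hypothetical stand-in for an st_printf-style call: build
   "<msg><value>\n" on the stack and emit it without taking a lock.  */
static int
report_error (int fd, const char *msg, long value)
{
  char buf[256];
  size_t pos = 0;
  int saved_errno = errno;   /* a signal handler should not clobber errno */
  int ret;

  pos = buf_puts (buf, sizeof buf, pos, msg);
  pos = buf_putl (buf, sizeof buf, pos, value);
  pos = buf_puts (buf, sizeof buf, pos, "\n");
  ret = write_all (fd, buf, pos);
  errno = saved_errno;
  return ret;
}
```

A call such as report_error (STDERR_FILENO, "Fortran runtime error: unit = ", 7) would then be usable from a signal handler. The price is that messages longer than the buffer are truncated, and output from several threads can interleave when a message needs more than one write.
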
Well, I realize perfection is impossible, so I'm settling for merely
improving the status quo!

> Both I and the VMS people adopted the ratchet design.  You have N
> levels of error recovery, each level allocates all of the resources
> it needs before startup, and any exception during level K increases
> the level to K+1 and calls the level K+1 error handler.  When you
> have an exception at level N, you just die.

To some extent we have a crude version of this, in that when we're
entering many of the fatal error handling functions we do a recursion
check and if that fails, die.

Also, in a recent patch of mine
(http://gcc.gnu.org/ml/gcc-patches/2011-05/msg00584.html ) the fatal
signal handler function has been reworked to hopefully deal better
with other signal(s) being delivered before it's done; that code is
modeled after an example in the glibc manual, and I'm a bit unsure if
the recursion check thingy really works or we just end up in an
infinite recursion (that is, do we need to re-set to the default
handler before re-raising? I have a vague memory that the signal
handler for SIGXXX must finish before starting the handler for another
SIGXXX pending signal, which would make the current version safe).

> That imposes the constraint that all diagnostics have a fixed upper
> bound on the resources they need (not just buffer space, but that's
> the main one).  It's a real bummer when the system has some critical
> resources that you can't reserve, so you have to treat an allocation
> failure as an exception, but buffer space is not one such.
>
> That also tackles the thread problem, not very satisfactorily.  If a
> resource needs to be locked, you can try to get it for a bit, and
> then raise a higher exception if you can't.  And, typically, one or
> more of the highest levels are for closing down the process, and
> simply suspend any subsequent threads that call them (i.e. just leave
> them waiting for a lock they won't get).

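The ratchet scheme quoted above could be sketched roughly like this. It is an illustration only, not code from libgfortran or VMS: RATCHET_LEVELS, LOCK_ATTEMPTS, lock_bounded, the two level handlers and ratchet_report are all made-up names, and a C11 atomic_flag stands in for whatever lock a real runtime would use. Each level uses only resources that exist before the error happens, a handler that cannot get the lock within a bounded number of attempts reports failure, and the level only ever goes up.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <unistd.h>

#define RATCHET_LEVELS 2     /* hypothetical N */
#define LOCK_ATTEMPTS  1000  /* hypothetical bound on lock attempts */

/* A handler returns 0 on success, nonzero if it hit an "exception".  */
typedef int (*ratchet_handler) (const char *msg);

static atomic_flag err_lock = ATOMIC_FLAG_INIT;
static atomic_int ratchet_level;   /* only ever increases */

/* "Try to get it for a bit": 0 if the lock was taken, -1 if we gave up.  */
static int
lock_bounded (atomic_flag *lock, int attempts)
{
  while (attempts-- > 0)
    if (!atomic_flag_test_and_set (lock))
      return 0;
  return -1;
}

/* Level 0: print the caller's message under the lock.  */
static int
level0_locked (const char *msg)
{
  size_t len = 0;
  ssize_t w;

  if (lock_bounded (&err_lock, LOCK_ATTEMPTS) != 0)
    return -1;                       /* cannot lock: escalate */
  while (msg[len] != '\0')
    len++;
  w = write (STDERR_FILENO, msg, len);
  atomic_flag_clear (&err_lock);
  return w < 0 ? -1 : 0;
}

/* Level 1: no lock at all, just a fixed message that is already in
   static storage.  */
static int
level1_unlocked (const char *msg)
{
  static const char fallback[] = "error (diagnostic lock unavailable)\n";

  (void) msg;
  return write (STDERR_FILENO, fallback, sizeof fallback - 1) < 0 ? -1 : 0;
}

/* Run handlers starting at the current level.  A failure at level K
   raises the level to K+1 and tries that handler.  Returns the level
   that succeeded, or -1 when every level failed and the caller should
   just die (e.g. with _exit).  */
static int
ratchet_report (const ratchet_handler *handlers, int n, const char *msg)
{
  int k = atomic_load (&ratchet_level);

  while (k < n)
    {
      int seen;

      if (handlers[k] (msg) == 0)
        return k;
      k++;
      /* Publish the new level, but never lower one set by another thread.  */
      seen = atomic_load (&ratchet_level);
      while (seen < k
             && !atomic_compare_exchange_weak (&ratchet_level, &seen, k))
        ;
      if (seen > k)
        k = seen;
    }
  return -1;
}
```

With handlers = { level0_locked, level1_unlocked }, a thread that finds the lock held falls through to the fixed message, and later errors start at level 1 directly. The "suspend any subsequent threads" part of the quoted design is left out of this sketch.
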
I think in our case the situation is a bit easier in that we're not
trying to recover from a serious failure, merely print some diagnostic
information without getting stuck in a deadlock.

-- 
Janne Blomqvist