The intermittent failures on Darwin are due to a kernel bug tripped by java.lang.Process.waitFor().

The bug appears to be that if:
- the program is multithreaded
- it is blocking SIGCHLD
- it receives a SIGCHLD due to a process terminating
- later it calls sigsuspend (but not sigwait)

then the SIGCHLD may never be delivered, and so the process will wait in sigsuspend forever.

It's intermittent because it works fine if the sigsuspend starts before the SIGCHLD is sent. This also explains why it happens more often with gij.

I've filed this as <rdar://problem/4736203>. We could work around it by using a timeout of some kind; for example, creating a new thread which sends a SIGCHLD manually after some period of time. (Obviously, only on Darwin, and maybe only on versions with the bug.) Do we think this is a good idea?
