Hi Ed,

On Jun 12 11:39, Ed Martin wrote:
> It appears that opening a socket, and then calling connect() on it and
> from another thread calling close() on the socket while it's still in
> connect() results in a deadlock. Furthermore in this state the thread
> cannot be canceled and connect() will never return (my testcase uses
> pthread_cancel(), but it happens without that as well)

Thanks for the testcase. I tried this with Cygwin 2.0.4 on 32 bit
Windows 7 and 64 bit Windows 8.1. In both cases I get

  $ ./bug
  Test started
  connect: Bad address
  no bug

The "Bad address" isn't exactly right. I changed that so connect now
returns the same error codes as if shutdown had been called.

Note that there's no hang. I can't reproduce a deadlock.

The difference is that on Linux connect continues to hang until the call
to pthread_cancel, while on Cygwin it returns with an error as soon as
you call close. I don't see how the Linux behaviour can be emulated
under Cygwin, due to the way Windows socket event handling works (which
is what Cygwin uses under the hood). Either way should be fine, since
both unblock the connect call.

However, calling close on a descriptor while another thread is
performing a system call on that descriptor is undefined. Even the
Linux man page for close warns:

  It is probably unwise to close file descriptors while they may be in
  use by system calls in other threads in the same process. Since a
  file descriptor may be reused, there are some obscure race conditions
  that may cause unintended side effects.

See, e.g., http://linux.die.net/man/2/close

In Cygwin the problem is that a close() call also removes the objects
and data structures connected to the descriptor. Calling close on a
descriptor in one thread ultimately lets still-running system calls in
other threads access wrong memory or synchronization objects.

What you should do, in theory, is to use nonblocking sockets in
conjunction with select, or signal the blocking thread so connect
returns with EINTR, and only then close the socket.

The problem with the latter approach is that it won't work with socket
functions in Cygwin up to 2.0.4 :( The reason is that SA_RESTART is
enforced in all threads other than the main thread for some reason. The
code in question pretty much looks like outdated behaviour.

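To illustrate the first approach (nonblocking socket plus select), here
is a minimal sketch. It is not what the patches do, and all names in it
are hypothetical: connect_with_cancel, close_and_fail, and the cancel_fd
parameter are invented for this example. It assumes the other thread
asks for cancellation by writing a byte to a pipe whose read end is
passed as cancel_fd, rather than by calling close on the socket. The
descriptor is then only ever closed by the thread that uses it.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>   /* struct sockaddr_in, for callers building addresses */
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: close fd, report err through errno, return -1. */
static int close_and_fail(int fd, int err)
{
    close(fd);
    errno = err;
    return -1;
}

/* Hypothetical helper: start a nonblocking connect and wait in select()
   until the socket is writable, cancel_fd becomes readable, or the
   timeout expires. Returns the connected (still nonblocking) descriptor,
   or -1 with errno set. */
int connect_with_cancel(const struct sockaddr *addr, socklen_t len,
                        int cancel_fd, int timeout_ms)
{
    fd_set rfds, wfds;
    struct timeval tv;
    socklen_t errlen;
    int fd, flags, n, err = 0;

    assert(addr != NULL && cancel_fd >= 0);

    fd = socket(addr->sa_family, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        return close_and_fail(fd, errno);

    if (connect(fd, addr, len) == 0)
        return fd;                          /* connected immediately */
    if (errno != EINPROGRESS)
        return close_and_fail(fd, errno);

    FD_ZERO(&rfds);
    FD_SET(cancel_fd, &rfds);
    FD_ZERO(&wfds);
    FD_SET(fd, &wfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    n = select((fd > cancel_fd ? fd : cancel_fd) + 1, &rfds, &wfds, NULL, &tv);
    if (n < 0)
        return close_and_fail(fd, errno);   /* includes EINTR */
    if (n == 0)
        return close_and_fail(fd, ETIMEDOUT);
    if (FD_ISSET(cancel_fd, &rfds))
        return close_and_fail(fd, ECANCELED);

    /* Writable: SO_ERROR tells whether the connect actually succeeded. */
    errlen = sizeof err;
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &errlen) < 0)
        return close_and_fail(fd, errno);
    if (err != 0)
        return close_and_fail(fd, err);
    return fd;
}
```

The cancelling thread would then do something like write(pipe_wr, "x", 1)
instead of close(sock), so no descriptor is torn down underneath a
running system call.
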
I applied patches to fix or work around the problems outlined above and
uploaded new developer snapshots to https://cygwin.com/snapshots/

Please give 'em a try.


Thanks,
Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat