On Fri, Oct 26, 2007 at 03:09:01PM -0700, Stephen Hemminger wrote:
> > close() from another thread is not a way to abort blocked accept(). Never
> > promised to be that. Just as close() from another thread is not a way to
> > abort blocked write() or read() or sendmsg() or...
>
> The problem is the Linux interpretation conflicts with the expectation
> of applications that run on other Unix systems. Most likely, it is
> one of those corner cases not covered by SUS or Posix specs otherwise
> it would have come up earlier. The existing Linux behavior works fine
> it just isn't expected (or well documented).
>
> I'm fine with just closing the bug (which is what I did initially), but
> where should this get documented?

close(2), perhaps?  "System call on opened file holds a reference to opened
file regardless of what happens to descriptor originally passed to it" or
something to the same effect...

That's what really happens - you get the same effect as if there had been
an additional temporary opened descriptor for that sucker.

And really, multithreaded application that has one thread rip descriptors
from under another should be damn careful on _any_ system.  Anything that
goes "I've got -EBADF, guess another thread had removed that descriptor,
got to recover" is insane - in effect, it calls accept() blindly and hopes
that race will play out nicely, without hitting
	* thread A calls accept(3)
	* thread B calls close()
	* thread B calls e.g. dup() for unrelated reason and gets the same
	  descriptor reused
	* thread A finally gets from libc to accept(2), sees no EBADF and
	  proceeds with accept() on completely unrelated socket, with no
	  indication of the problem (or returns giving you a bogus errno,
	  depending on what the hell that descriptor happens to be).

IOW, if you rely on -EBADF to deal with such (userland) races, you are
extremely likely to be screwed.  On Linux, on FreeBSD, on Solaris, whatever.
In very controlled circumstances you might get away with that, but it's
almost certainly a Very Bad Idea(tm).

The bottom line: if descriptor table is a shared resource in your
multithreaded program, treat it as such.  Kernel will survive having
descriptors closed in the middle of syscall just fine; your userland
code is a different story.
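
A rough userland sketch of both points, for illustration only.  It is
single-threaded on purpose so the "race" always plays out the bad way, and
it relies on the POSIX rule that pipe() and dup() hand out the lowest free
descriptor numbers.  The function names are made up for this example, and
the dup() in the second function is only a stand-in for the reference the
kernel holds during a syscall, not what the kernel literally does.

```c
#include <unistd.h>

/*
 * Replays the close()/dup() sequence in one thread.
 * Returns 1 if a read through the stale descriptor number silently hits
 * an unrelated pipe, 0 if it does not, -1 if the setup failed.
 */
int stale_fd_hits_unrelated_file(void)
{
    int victim[2], other[2];
    char c = 0;
    int result = -1;

    if (pipe(victim) != 0)
        return -1;
    if (pipe(other) != 0) {
        close(victim[0]);
        close(victim[1]);
        return -1;
    }

    int stale = victim[0];       /* "thread A" remembers this number   */
    close(victim[0]);            /* "thread B" closes it ...           */
    int reused = dup(other[0]);  /* ... and dup() reuses the free slot */

    if (reused >= 0 && write(other[1], "x", 1) == 1) {
        /* "thread A" finally issues its syscall: no EBADF, wrong object. */
        ssize_t n = read(stale, &c, 1);
        result = (reused == stale && n == 1 && c == 'x');
    }

    if (reused >= 0)
        close(reused);
    close(victim[1]);
    close(other[0]);
    close(other[1]);
    return result;
}

/*
 * Userland analogy for "an additional temporary opened descriptor":
 * the opened file stays usable through the extra reference after the
 * original descriptor is closed.
 * Returns 1 if the read through the extra reference works, 0 if it does
 * not, -1 if the setup failed.
 */
int open_file_outlives_descriptor(void)
{
    int p[2];
    char c = 0;
    int result = -1;

    if (pipe(p) != 0)
        return -1;

    int extra = dup(p[0]);       /* stand-in for the syscall's reference */
    close(p[0]);                 /* original descriptor is gone          */

    if (extra >= 0) {
        if (write(p[1], "y", 1) == 1)
            result = (read(extra, &c, 1) == 1 && c == 'y');
        close(extra);
    }
    close(p[1]);
    return result;
}
```

Call both from a main() of your own; in a single-threaded process each
should return 1.  The first one is the nasty case: the read succeeds, so
code that waits for -EBADF never notices anything went wrong.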