On 27/10/2015 23:17, Al Viro wrote:
Frankly, as far as I'm concerned, the bottom line is
* there are two variants of semantics in that area and there's not much that could be done about that.
Yes, that seems to be the case.
* POSIX is vague enough for both variants to comply with it (it's also very badly written in the area in question).
On that aspect I disagree: the POSIX semantics seem clear to me, and they differ from the Linux behaviour.
* I don't see any way to implement something similar to Solaris behaviour without a huge increase of memory footprint or massive cacheline pingpong. Solaris appears to go for memory footprint from hell - cacheline per descriptor (instead of a pointer per descriptor).
Yes, that does seem to be the case. Thanks for the detailed explanation you've provided as to why that's so.
* the benefits of Solaris-style behaviour are not obvious - all things equal it would be interesting, but the things are very much not equal. What's more, if your userland code is such that accept() argument could be closed by another thread, the caller *cannot* do anything with said argument after accept() returns, no matter which variant of semantics is used.
Yes, irrespective of how you terminate the accept(), once it returns with an error it's unsafe to use the FD, except for transient failures such as EAGAIN, EINTR etc. However, the shutdown() behaviour of Linux is not POSIX compliant, and allowing an accept() to continue on an FD that's been closed doesn't seem correct either.
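To make the error-handling point concrete, here is a minimal sketch of an accept() wrapper that separates transient failures from ones after which the listening descriptor should be treated as unusable. The names accept_error_is_transient() and accept_retrying() are hypothetical helpers made up for illustration, and treating only EINTR and EAGAIN/EWOULDBLOCK as transient is an assumption; what "etc." should cover depends on the application.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical helper: true if accept() failed for a transient reason,
 * so the listening descriptor may be used for another attempt.
 * The set of errno values here is an assumption, not an exhaustive list. */
bool accept_error_is_transient(int err)
{
    switch (err) {
    case EINTR:
    case EAGAIN:
#if EWOULDBLOCK != EAGAIN
    case EWOULDBLOCK:
#endif
        return true;
    default:
        return false;
    }
}

/* Hypothetical wrapper: retries only when interrupted by a signal.
 * Returns the connected descriptor, or -1 with errno set. On -1 the
 * caller should check accept_error_is_transient(errno) before touching
 * listen_fd again; any other error (EBADF, EINVAL, ...) may mean another
 * thread has closed or shut down the descriptor. */
int accept_retrying(int listen_fd)
{
    assert(listen_fd >= 0); /* a negative descriptor is a caller bug */
    for (;;) {
        int fd = accept(listen_fd, NULL, NULL);
        if (fd >= 0)
            return fd;
        if (errno != EINTR)
            return -1;
    }
}
```

EAGAIN is deliberately not retried inside the loop, since on a non-blocking socket that would spin; the caller is expected to wait for readiness and call again.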
* [Linux-specific aside] our __alloc_fd() can degrade quite badly with some use patterns. The cacheline pingpong in the bitmap is probably inevitable, unless we accept considerably heavier memory footprint, but we also have a case when alloc_fd() takes O(n) and it's _not_ hard to trigger - close(3);open(...); will have the next open() after that scanning the entire in-use bitmap. I think I see a way to improve it without slowing the normal case down, but I'll need to experiment a bit before I post patches. Anybody with examples of real-world loads that make our descriptor allocator degrade is very welcome to post the reproducers...
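The close(3);open(...) pattern described above can be illustrated with a user-space toy model. This is only a sketch under stated assumptions, not the kernel's code: the toy_* names are invented, one bool per descriptor stands in for the in-use bitmap, and the next_fd hint is assumed to mean "no free slot exists below this index". The model allocates the lowest free descriptor and counts how many slots each allocation has to examine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_FDS 4096

/* Toy descriptor table (hypothetical, for illustration only). */
struct toy_fdtable {
    bool in_use[TOY_MAX_FDS]; /* one flag per descriptor, standing in for a bitmap */
    size_t next_fd;           /* hint: no free slot exists below this index */
    size_t probes;            /* slots examined by the most recent allocation */
};

/* Allocate the lowest free descriptor, scanning upward from the hint.
 * Returns the descriptor, or -1 if the table is full. */
int toy_alloc_fd(struct toy_fdtable *t)
{
    t->probes = 0;
    for (size_t fd = t->next_fd; fd < TOY_MAX_FDS; fd++) {
        t->probes++;
        if (!t->in_use[fd]) {
            t->in_use[fd] = true;
            t->next_fd = fd + 1; /* everything below fd + 1 is now in use */
            return (int)fd;
        }
    }
    return -1;
}

/* Release a descriptor and pull the hint back if a lower slot opened up. */
void toy_close_fd(struct toy_fdtable *t, int fd)
{
    assert(fd >= 0 && (size_t)fd < TOY_MAX_FDS && t->in_use[fd]);
    t->in_use[fd] = false;
    if ((size_t)fd < t->next_fd)
        t->next_fd = (size_t)fd;
}
```

In this model, with descriptors 0 to 999 in use, closing 3 and allocating again examines a single slot and returns 3, but leaves the hint at 4. The allocation after that has to walk slots 4 to 999 before it finds 1000 free, examining 997 slots, which is the linear scan in question.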
It looks like the remaining discussion is going to be about Linux implementation details so I'll bow out at this point. Thanks again for all the helpful explanation.
-- Alan Burlison --