Re: Fw: [Bug 106241] New: shutdown(3)/close(3) behaviour is incorrect for sockets in accept(3)

Alan Burlison Thu, 22 Oct 2015 03:56:44 -0700

On 22/10/2015 05:21, Al Viro wrote:

>> Most of the work on using a file descriptor is local to the thread.


> Using - sure, but what of cacheline dirtied every time you resolve a
> descriptor to file reference?


Don't you have to do that anyway, to do anything useful with the file?

> How much does it cover and to what
> degree is that local to thread?  When it's a refcount inside struct file -
> no big deal, we'll be reading the same cacheline anyway and unless several
> threads are pounding on the same file with syscalls at the same time,
> that won't be a big deal.  But when refcounts are associated with
> descriptors...

There is a refcount in the struct held in the per-process list of open
files and the 'slow path' processing is only taken if there's more than
one LWP in the process that's accessing the file.

> In case of Linux we have two bitmaps and an array of pointers associated
> with descriptor table.  They grow on demand (in parallel)
>         * reserving a descriptor is done under ->file_lock (dropped/regained
> around memory allocation if we end up expanding the sucker, actual reassignment
> of pointers to array/bitmaps is under that spinlock)
>         * installing a pointer is lockless (we wait for ongoing resize to
> settle, RCU takes care of the rest)
>         * grabbing a file by index is lockless as well
>         * removing a pointer is under ->file_lock, so's replacing it by dup2().


Is that table per-process or global?

> The point is, dup2() over _unused_ descriptor is inherently racy, but dup2()
> over a descriptor we'd opened and kept open should be safe.  As it is,
> their implementation boils down to "userland must do process-wide exclusion
> between *any* dup2() and all syscalls that might create a new descriptor -
> open()/pipe()/socket()/accept()/recvmsg()/dup()/etc".  At the very least,
> it's a big QoI problem, especially since such userland exclusion would have
> to be taken around the operations that can block for a long time.  Sure,
> POSIX wording regarding dup2 is so weak that this behaviour can be argued
> to be compliant, but... replacement of the opened file associated with
> newfd really ought to be atomic to be of any use for multithreaded processes.

There's existing language in the Issue 7 dup2() description that says
dup2() has to be atomic:


"the dup2( ) function provides unique services, as no other
interface is able to atomically replace an existing file descriptor."

And there is some new language in Issue 7 Technical Corrigenda 2 that
reinforces that, when it's talking about reassignment of
stdin/stdout/stderr:

"Furthermore, a close() followed by a reopen operation (e.g. open(),
dup() etc) is not atomic; dup2() should be used to change standard file
descriptors."

I don't think that it's possible to claim that a non-atomic dup2() is
POSIX-compliant.
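
To make the difference concrete, here is a minimal sketch of the two ways
of repointing an already-open descriptor. It is in Python purely for
brevity (os.close/os.open/os.dup2 are thin wrappers over the syscalls);
the function names replace_fd_racy and replace_fd_dup2 are made up for
illustration and are not from any standard or existing code.

```python
import os


def replace_fd_racy(target_fd, path):
    """close() followed by a reopen: NOT atomic.

    This relies on open() handing back the lowest free descriptor.
    Between the two calls target_fd is unused, so another thread that
    calls open()/socket()/accept()/... can be given that number instead.
    """
    os.close(target_fd)
    new_fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    return new_fd  # equals target_fd only if nothing else got in first


def replace_fd_dup2(target_fd, path):
    """Open the new file on a scratch descriptor, then dup2() it over
    target_fd.

    With an atomic dup2(), target_fd goes straight from the old file to
    the new one without a window in which it appears closed.
    """
    tmp_fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        os.dup2(tmp_fd, target_fd)
    finally:
        os.close(tmp_fd)  # target_fd keeps its own reference to the file
    return target_fd
```

In a single-threaded process both variants normally end up with the same
result; the difference only shows when other threads are creating
descriptors at the same time.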

> IOW, if newfd is an opened descriptor prior to dup2() and no other thread
> attempts to close it by any means, there should be no window during which
> it would appear to be not opened.  Linux and FreeBSD satisfy that; OpenBSD
> seems to do the same, from the quick look.  NetBSD doesn't, no idea about
> Solaris.  FWIW, older NetBSD implementation (prior to "File descriptor changes,
> discussed on tech-kern:" back in 2008) used to behave like OpenBSD one; it
> had fixed a lot of crap, so it's entirely possible that OpenBSD simply has
> kept the old implementation, with tons of other races in that area, but this
> dup2() race got introduced in that rewrite.

Related to dup2(), there's some rather surprising behaviour on Linux.
Here's the scenario:


----------

ThreadA opens, listens and accepts on socket fd1, waiting for incoming
connections.

ThreadB waits for a while, then opens normal file fd2 for read/write.
ThreadB uses dup2 to make fd1 a clone of fd2.
ThreadB closes fd2.

ThreadA remains sat in accept on fd1 which is now a plain file, not a
socket.

ThreadB writes to fd1, the result of which appears in the file, so fd1
is indeed operating as a plain file.

ThreadB exits. ThreadA is still sat in accept on fd1.

A connection is made to fd1 by another process. The accept call succeeds
and returns the incoming connection. fd1 is still operating as a
socket, even though it's now actually a plain file.

----------
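
A rough sketch of the scenario above, for anyone who wants to try it. It
makes several assumptions that are not part of the description: it is
written in Python rather than C, it uses an AF_UNIX listening socket in a
temporary directory so that no network setup is needed, the incoming
connection is made from the same process rather than another one, and a
sleep stands in for "waits for a while". The names run_scenario, settle
and wait are made up for illustration.

```python
import os
import socket
import tempfile
import threading
import time


def run_scenario(settle=0.5, wait=2.0):
    """Replay the ThreadA/ThreadB scenario and return what was observed."""
    seen = {"accepted": False, "accept_error": None, "file_data": None}
    workdir = tempfile.mkdtemp()
    sock_path = os.path.join(workdir, "listener.sock")
    file_path = os.path.join(workdir, "plain.txt")

    # ThreadA's socket: open, bind, listen. fd1 is its descriptor number.
    listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    listener.bind(sock_path)
    listener.listen(1)
    fd1 = listener.fileno()

    def thread_a():
        try:
            conn, _ = listener.accept()    # blocks in accept() on fd1
            seen["accepted"] = True
            conn.close()
        except OSError as exc:             # e.g. EBADF on other systems
            seen["accept_error"] = exc.errno

    def thread_b():
        time.sleep(settle)                 # crude: let ThreadA reach accept()
        fd2 = os.open(file_path, os.O_RDWR | os.O_CREAT, 0o600)
        os.dup2(fd2, fd1)                  # fd1 now refers to the plain file
        os.close(fd2)
        os.write(fd1, b"written via fd1\n")
        with open(file_path, "rb") as f:   # check the write reached the file
            seen["file_data"] = f.read()

    a = threading.Thread(target=thread_a, daemon=True)
    b = threading.Thread(target=thread_b)
    a.start()
    b.start()
    b.join()                               # ThreadB exits

    # A connection is now made to the address the socket was listening on.
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    client.settimeout(wait)
    try:
        client.connect(sock_path)
    except OSError:
        pass                               # the listener may already be gone
    a.join(wait)                           # give accept() a chance to return
    client.close()
    if not a.is_alive():
        listener.close()                   # closes fd1, by now the plain file
    return seen


if __name__ == "__main__":
    print(run_scenario())
```

Going by the behaviour described above, on Linux I would expect
"accepted" to come back True with "file_data" holding the bytes written
through fd1. On a system that wakes the thread in accept when fd1 is
replaced, "accept_error" should be set instead.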

I assume this is another consequence of the fact that threads waiting in
accept don't get a notification if the fd they are using is closed,
either directly by a call to close or by a syscall such as dup2. Not
waking up other threads on a fd when it is closed seems like it's
undesirable behaviour.

I can see the reasoning behind allowing shutdown to be used to do such a
wakeup even if that's not POSIX-compliant - it may make it slightly
easier for applications to avoid fd recycling races. However the current
situation is that shutdown is the *only* way to perform such a wakeup -
simply closing the fd has no effect on any other threads. That seems
incorrect.


--
Alan Burlison
