It is currently impossible for the reader of an AF_UNIX stream socket to fully reconstruct the data sent in the presence of SCM_RIGHTS, without reading byte-for-byte. This prevents efficiently proxying or providing a high-level buffering interface to these sockets.
Unfortunately POSIX does not specify the behavior of SCM_RIGHTS beyond the existence of the constant and in general the semantics of ancillary data attached to streams isn't on the firmest ground. The following is how these concepts work on every *nix I am aware of. An SCM_RIGHTS array of file descriptors is uniquely associated with a single byte in the stream; this byte can be identified by reading the stream one byte at a time. sendmsg() associates the given fds with the first byte in the buffer provided. When recvmsg() returns fds, we know that exactly one fd-associated byte appears in the buffer; if necessary partial reads are employed to guarantee this invariant. The issue in question concerns recvmsg(). Linux allows the fd-associated byte to appear anywhere within the buffer; the reader has no way of knowing which one it is. Other *nix systems (at least macOS/Solaris/FreeBSD), make recvmsg() symmetric with sendmsg() and guarantee that the fd-associated byte is the first byte in the buffer. The first change of this patch has Linux adopt the stronger semantics of its peers, which fixes the issue at hand while also bringing us closer to standardizing SCM_RIGHTS. The existing implementation enforces the one-fd-associated-byte constraint with a partial read after any skb with attached fds. The second & third changes remove this unnecessary constraint, allowing data from subsequent skbs to be copied to the buffer, as long those skbs do not have fds attached to them. Signed-off-by: Christopher Monsanto <ch...@monsan.to> --- net/unix/af_unix.c | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-) diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index ddb838a1b74c..761837a7da65 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -2297,6 +2297,9 @@ static int unix_stream_read_generic(struct unix_stream_read_state *state, unix_state_unlock(sk); + if (UNIXCB(skb).fp && copied > 0) + break; + if (check_creds) { /* Never glue messages from different writers */ if (!unix_skb_scm_eq(skb, &scm)) @@ -2356,9 +2359,6 @@ static int unix_stream_read_generic(struct unix_stream_read_state *state, skb_unlink(skb, &sk->sk_receive_queue); consume_skb(skb); - - if (scm.fp) - break; } else { /* It is questionable, see note in unix_dgram_recvmsg. */ @@ -2367,9 +2367,6 @@ static int unix_stream_read_generic(struct unix_stream_read_state *state, sk_peek_offset_fwd(sk, chunk); - if (UNIXCB(skb).fp) - break; - skip = 0; last = skb; last_len = skb->len; -- 2.17.1