> From: David Miller [mailto:da...@davemloft.net]
> From: Debabrata Banerjee <dbane...@akamai.com>
> 
> > When TCP_CLOSE_NORST is set before a close(), offload sinking of
> > unwanted data to the kernel with low resource usage, with a timeout of
> > TCP_LINGER2. The socket will transition to FIN_WAIT1 and then
> > FIN_WAIT2 where it will ack data until either the timeout is hit, or a
> > RST or FIN is received.
> >
> 
> This is a very serious protocol violation.

Actually, I don't believe it violates the protocol. RFC 1122, section
4.2.2.13, reads:

"A host MAY implement a "half-duplex" TCP close sequence, so that an
application that has called CLOSE cannot continue to read data from the
connection.  If such a host issues a CLOSE call while received data is
still pending in TCP, or if new data is received after CLOSE is called,
its TCP SHOULD send a RST to show that data was lost."

Per RFC 2119, the keyword SHOULD means recommended but not required, as
opposed to MUST.
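
To make the quoted behaviour concrete, here is a minimal sketch that closes a
socket while received data is still pending and reports what the sender then
observes. The function name, payload size and sleep intervals are illustrative
assumptions, not part of the patch. On a stack that follows the SHOULD I would
expect "reset"; a stack that does not may report "eof" instead, so the code
treats both as possible.

```python
import socket
import time


def close_with_unread_data(payload=b"x" * 4096):
    """Report what a sender sees when its peer closes without reading."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)

    sender = socket.create_connection(listener.getsockname())
    receiver, _ = listener.accept()

    sender.sendall(payload)
    time.sleep(0.2)      # give the payload time to land in the receive queue
    receiver.close()     # CLOSE while received data is still pending
    time.sleep(0.2)      # give the RST (or FIN) time to come back

    sender.settimeout(2.0)
    try:
        # The receiver never sends data, so a successful recv() means EOF.
        sender.recv(1)
        outcome = "eof"
    except ConnectionResetError:
        outcome = "reset"        # peer answered with a RST
    except socket.timeout:
        outcome = "timeout"      # nothing came back at all
    finally:
        sender.close()
        listener.close()
    return outcome


if __name__ == "__main__":
    print(close_with_unread_data())
```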

> You're telling the remote end that you received the data even though the
> socket was closed and nothing actually "sunk" the bytes.

It does the same thing the application would do, but with much less overhead.
The application called close() because it no longer cares about new data, but
it still expects the data it passed to send() before close() to actually be
sent.
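
For comparison, what the application has to do itself today is roughly the
following: half-close the socket, then read and discard incoming data until
the peer sends FIN, the connection errors out, or a deadline passes. This is
only a userspace sketch of that pattern, not the patch. The names
drain_and_close and demo, the 5 second default timeout and the buffer size are
assumptions chosen for illustration.

```python
import socket
import threading
import time


def drain_and_close(sock, timeout=5.0, bufsize=65536):
    """Half-close, then discard incoming data until FIN, error, or deadline."""
    sock.shutdown(socket.SHUT_WR)   # send our FIN; data already queued still goes out
    deadline = time.monotonic() + timeout
    discarded = 0
    try:
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break
            sock.settimeout(remaining)
            chunk = sock.recv(bufsize)   # this copy to userspace is the overhead
            if not chunk:                # peer sent FIN
                break
            discarded += len(chunk)
    except OSError:                      # RST from the peer, or recv() timed out
        pass
    finally:
        sock.close()
    return discarded


def demo(total=1 << 20):
    """The client pushes `total` unwanted bytes; the server rejects and sinks them."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    result = {}

    def client():
        conn = socket.create_connection(listener.getsockname())
        conn.sendall(b"x" * total)
        conn.shutdown(socket.SHUT_WR)
        reply = b""
        while True:                      # read the server's reply up to its FIN
            chunk = conn.recv(64)
            if not chunk:
                break
            reply += chunk
        conn.close()
        result["reply"] = reply

    worker = threading.Thread(target=client)
    worker.start()
    server, _ = listener.accept()
    server.sendall(b"rejected\n")        # the response we still want delivered
    result["discarded"] = drain_and_close(server)
    worker.join()
    listener.close()
    return result


if __name__ == "__main__":
    print(demo())
```

Every discarded byte here is copied into the process just to be thrown away,
which is the cost the kernel-side approach is meant to avoid.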

> This doesn't even go into the issues of sending cumulative ACKs in response
> to data which is arbitrarily out-of-order.

I'm not sure what issues we'd run into here; out-of-order and duplicate ACKs
can happen normally.
 
> The whole problem is that the post data is sent before the client looks to see
> if the server is willing to accept the post data or not.
> 
> A: I'd like to send you 200MB of crap
>    [ 200MB of craaaa...
> B: Sorry I won't be accepting that, please don't send it.
> 
>    CLOSE, send reset since some of crap is queued up and
>    was never read
> 
> A: aaaaapp... received RESET
> A: Why didn't B accept my 200MB of crap?

That's correct, but it's not limited to POST; it could be any data transfer
over a TCP socket. Of course, in the 200MB POST case, sinking the data in the
application costs a lot of resources and copying to userspace.

> Sorry, you'll need to deal with this issue in another way.

Well, if the overlap with the definition of close() is what spooks you,
something similar could be implemented as a setsockopt(TCP_SINK_DATA) around
shutdown(), to instruct the socket to immediately dump data, but with higher
resource usage. However, as argued above, I don't currently believe this patch
violates the protocol.
