On Tue, 2016-04-12 at 20:31 +0800, Yang Yingliang wrote:
> I traced the cost cycles of handling backlog packets in
> __release_sock().
> 16.97 ms to handling about 12MB backlog packets, of which 13.66ms to do
> sk_data_ready.
> The speed of handling packets in TCP is 5.65Gb/s which is smaller than
> the NIC's bandwidth. So the packets will be dropped.
>
> If the cost of sk_data_read cannot be reduced, do we have other choice
> exclude dropping packets ?

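As a quick sanity check, the quoted figures are consistent with each
other. The arithmetic below is only a sketch: it assumes "12MB" means
12 * 10^6 bytes, and the helper name throughput_gbps is made up for
illustration.

```python
def throughput_gbps(nbytes, seconds):
    """Convert a byte count processed in `seconds` to Gbit/s."""
    return nbytes * 8 / seconds / 1e9

# Assumption: "12MB" is 12 * 10**6 bytes (the report says "about 12MB").
backlog_bytes = 12 * 10**6
total_ms = 16.97          # time spent handling the backlog
data_ready_ms = 13.66     # portion spent in sk_data_ready

rate = throughput_gbps(backlog_bytes, total_ms / 1000)
share = data_ready_ms / total_ms

print(f"backlog processing rate: {rate:.2f} Gbit/s")
print(f"share of time in sk_data_ready: {share:.0%}")
```

Under that assumption the rate comes out a little under 5.7 Gbit/s,
in line with the 5.65Gb/s reported, and roughly four fifths of the
time goes to sk_data_ready.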
Normally, the TCP stack sends ACK packets carrying an appropriate RWIN.
The sender should not send more data than the RWIN allows; even if 128
threads are using one TCP socket, it does not matter.

Imagine you do not have a backlog problem (nothing calls sendmsg()
while you receive data), and nothing reads from the socket. Then the
receiver should eventually send WIN 0 back to the sender, and the
sender should stop before any drop can possibly happen.

I have no problem receiving one TCP flow at 34Gbit, so it must be
something related to the huge windows you seem to use.

One possibility could be to advertise a reduced rwin in ACK packets, so
that the sender is not allowed to continue the flood while we are
painfully processing a huge backlog.
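
To illustrate the flow-control argument, here is a toy model in Python.
It is not kernel code and does not model the socket backlog at all; it
only shows that a sender which respects the advertised window cannot
overrun the receive buffer, and how a cap on the advertised window
(the hypothetical rwin_cap parameter) limits how much arrives per
round trip. All names and numbers are made up for illustration.

```python
def simulate(rcvbuf, mss, rounds, drain_per_round=0, rwin_cap=None):
    """Toy model of window-based flow control.

    Each round the sender transmits at most the window advertised in
    the previous round, in whole segments of `mss` bytes.
    Returns (list of advertised windows, bytes dropped).
    """
    queued = 0    # bytes sitting unread in the receive queue
    dropped = 0   # bytes that did not fit in the receive buffer
    rwin = rcvbuf if rwin_cap is None else min(rcvbuf, rwin_cap)
    history = []
    for _ in range(rounds):
        # A well-behaved sender never exceeds the advertised window.
        sent = (rwin // mss) * mss
        accepted = min(sent, rcvbuf - queued)
        dropped += sent - accepted
        queued += accepted
        # The application may read some data (0 means nobody reads).
        queued -= min(queued, drain_per_round)
        # Advertise the free space, optionally clamped by the cap.
        free = rcvbuf - queued
        rwin = free if rwin_cap is None else min(free, rwin_cap)
        history.append(rwin)
    return history, dropped

MSS = 1460
BUF = 64 * MSS

# Nobody reads the socket: the window closes and nothing is dropped.
print(simulate(BUF, MSS, rounds=5))

# Same, but the receiver advertises at most 16 segments at a time.
print(simulate(BUF, MSS, rounds=5, rwin_cap=16 * MSS))
```

In both runs the advertised window ends at 0 and the drop counter
stays at 0; with the cap, the buffer simply fills over several rounds
instead of one.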