On Mon, 2016-12-05 at 16:37 +0100, Jesper Dangaard Brouer wrote:

> Do you think the splice technique would have the same performance
> benefit as having an MPMC queue with separate enqueue and dequeue
> locking (like we have with skb_array/ptr_ring that avoids cache bouncing)?

I believe ring buffers make sense for critical points in the kernel,
but for an arbitrary number of TCP/UDP sockets in a host they mean a
big increase in memory, and a practical problem whenever SO_RCVBUF is
changed, since the ring buffer would then need a dynamic resize.
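
Just to illustrate the resize problem, here is a minimal sketch of a
fixed-size per-socket ring using the existing skb_array API. The
function name, the sizing from sk_rcvbuf and the divisor are all
hypothetical; this is not a proposal, only what such a ring would
force us to do on every setsockopt(SO_RCVBUF):

#include <linux/skb_array.h>
#include <linux/skbuff.h>
#include <net/sock.h>

static int per_socket_ring_sketch(struct sock *sk, struct sk_buff *skb,
				  int new_size)
{
	struct skb_array rx_ring;	/* would have to live in struct sock */
	int err;

	/* Ring sized once, from the current SO_RCVBUF. */
	err = skb_array_init(&rx_ring, sk->sk_rcvbuf / 2048, GFP_KERNEL);
	if (err)
		return err;

	/* Producer (softirq): returns -ENOSPC when the ring is full. */
	if (skb_array_produce(&rx_ring, skb))
		kfree_skb(skb);

	/* Consumer (recvmsg): real code would copy to user space,
	 * free here to keep the sketch leak-free.
	 */
	kfree_skb(skb_array_consume(&rx_ring));

	/* The practical problem: SO_RCVBUF changed, so the ring has to
	 * be reallocated and copied while producers may still be running.
	 */
	err = skb_array_resize(&rx_ring, new_size, GFP_KERNEL);

	skb_array_cleanup(&rx_ring);
	return err;
}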

If you think about it, most sockets have few outstanding packets, like
0, 1 or 2. But they might also hold ~100 packets, sometimes...

For most TCP/UDP sockets, a linked list is simply good enough.
(We only very recently converted the out-of-order receive queue to an
RB tree.)

Now, if _two_ linked lists are also good enough in the very rare case
of floods, I would use two linked lists, if they can offer us a 50%
increase at small memory cost.
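
For reference, the splice pattern with two lists would look roughly
like this. input_list / process_list and the helpers are made-up
names; only the locking pattern matters (the consumer pays the shared
lock once per batch instead of once per packet):

#include <linux/skbuff.h>
#include <linux/spinlock.h>

/* Producers only ever touch input_list, the consumer mostly works on
 * its own private process_list.
 */
static struct sk_buff_head input_list;
static struct sk_buff_head process_list;

static void lists_init(void)
{
	skb_queue_head_init(&input_list);
	skb_queue_head_init(&process_list);
}

/* Producer (softirq): short critical section on input_list.lock. */
static void producer_enqueue(struct sk_buff *skb)
{
	spin_lock_bh(&input_list.lock);
	__skb_queue_tail(&input_list, skb);
	spin_unlock_bh(&input_list.lock);
}

/* Consumer (recvmsg): grab everything the producers queued with a
 * single locked splice, then dequeue packet per packet lock-free.
 */
static struct sk_buff *consumer_dequeue(void)
{
	if (skb_queue_empty(&process_list)) {
		spin_lock_bh(&input_list.lock);
		skb_queue_splice_tail_init(&input_list, &process_list);
		spin_unlock_bh(&input_list.lock);
	}
	return __skb_dequeue(&process_list);
}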

Then for very special cases, we have af_packet which should be optimized
for all the fancy stuff.

If an application really receives more than 1.5 Mpps per UDP socket,
then its author should seriously consider SO_REUSEPORT, and use more
than one vCPU in the VM. Cheap cloud offers are available from many
providers.
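
The user space side of SO_REUSEPORT is trivial: each worker opens its
own socket and sets the option before bind(), and the kernel spreads
incoming packets across the group. A minimal sketch (the helper name
is made up):

#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* One UDP socket per worker, all bound to the same port; the kernel
 * hashes incoming packets across the SO_REUSEPORT group.
 */
static int reuseport_udp_socket(uint16_t port)
{
	struct sockaddr_in addr;
	int one = 1;
	int fd = socket(AF_INET, SOCK_DGRAM, 0);

	if (fd < 0)
		return -1;
	if (setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &one, sizeof(one)) < 0)
		goto err;

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_ANY);
	addr.sin_port = htons(port);
	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
		goto err;
	return fd;
err:
	close(fd);
	return -1;
}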

The ring buffer queue might make sense in net/core/dev.c, since we
currently have two queues per cpu there (input_pkt_queue and
process_queue in softnet_data).

So you might want to experiment with that, because it looks like we
might move to a model where a single cpu does all the low-level
(busypoll) RX processing for a single queue per NUMA node, then
dispatches the IP/{TCP|UDP} processing to other cpus.



