Re: [PATCH net-next 0/5] net: add protocol level recvmmsg support

Eric Dumazet Fri, 25 Nov 2016 13:17:14 -0800

On Fri, 2016-11-25 at 16:39 +0100, Paolo Abeni wrote:
> The goal of recvmmsg() is to amortize the syscall overhead over a possibly
> long batch of messages, but for most networking protocols, e.g. UDP, the
> syscall overhead is negligible compared to the protocol-specific operations
> like dequeuing.


The problem with recvmmsg() is that it blows up the L1/L2 caches of the CPU.
It gives falsely 'good results' until other threads sharing the same cache
hierarchy start competing with you. Then performance is actually lower
than with regular recvmsg().

And I presume your tests did not really use the data once it was copied to
user space, i.e. did not do the typical operations a UDP server does on
incoming packets?

I would rather try to optimize normal recvmsg(), instead of adding so
much code in the kernel for this horrible recvmmsg() super system call.

Looking at how buggy sendmmsg() was until commit 3023898b7d4aac6
("sock: fix sendmmsg for partial sendmsg"), I fear that these 'super'
system calls are way too complex.

How could we improve UDP?

For example, we could easily have 2 queues to reduce false sharing and
lock contention.

1) One queue accessed by softirq to append packets.

2) One queue accessed by recvmsg(). Make sure these two queues do not
share a cache line.

When the second queue is empty, transfer the whole first queue to it in one
operation.

Look at process_backlog() in net/core/dev.c for an example of this
strategy.
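
A minimal user-space sketch of the two-queue idea, to make the splice step
concrete. This is not kernel code and not a proposed patch: a pthread mutex
stands in for the queue spinlock, and the names (two_queue_sock, tq_enqueue,
tq_dequeue, CACHE_LINE) and the 64-byte cache-line size are assumptions made
up for illustration. It also assumes a single reader, and uses the GCC/Clang
aligned attribute to keep the two queues on separate cache lines.

```c
#include <pthread.h>
#include <stddef.h>

#define CACHE_LINE 64	/* assumed cache-line size */

struct pkt {
	struct pkt *next;
	int id;
};

struct pkt_list {
	struct pkt *head;
	struct pkt *tail;
};

/* Queue 1: appended to by the producer (softirq in the kernel case). */
struct input_queue {
	pthread_mutex_t lock;
	struct pkt_list list;
} __attribute__((aligned(CACHE_LINE)));

/* Queue 2: touched only by the single reader, so it needs no lock. */
struct reader_queue {
	struct pkt_list list;
} __attribute__((aligned(CACHE_LINE)));

/* Each member is cache-line aligned, so the queues never share a line. */
struct two_queue_sock {
	struct input_queue in;
	struct reader_queue rd;
};

static void tq_init(struct two_queue_sock *sk)
{
	pthread_mutex_init(&sk->in.lock, NULL);
	sk->in.list.head = sk->in.list.tail = NULL;
	sk->rd.list.head = sk->rd.list.tail = NULL;
}

/* Producer side: append one packet to queue 1 under the lock. */
static void tq_enqueue(struct two_queue_sock *sk, struct pkt *p)
{
	p->next = NULL;
	pthread_mutex_lock(&sk->in.lock);
	if (sk->in.list.tail)
		sk->in.list.tail->next = p;
	else
		sk->in.list.head = p;
	sk->in.list.tail = p;
	pthread_mutex_unlock(&sk->in.lock);
}

/* Reader side: take the lock only when queue 2 has run dry. */
static struct pkt *tq_dequeue(struct two_queue_sock *sk)
{
	struct pkt *p;

	if (!sk->rd.list.head) {
		/* Move the whole of queue 1 over in one operation. */
		pthread_mutex_lock(&sk->in.lock);
		sk->rd.list = sk->in.list;
		sk->in.list.head = sk->in.list.tail = NULL;
		pthread_mutex_unlock(&sk->in.lock);
	}

	p = sk->rd.list.head;
	if (p) {
		sk->rd.list.head = p->next;
		if (!sk->rd.list.head)
			sk->rd.list.tail = NULL;
		p->next = NULL;
	}
	return p;
}
```

With this layout the reader takes the shared lock once per batch instead of
once per packet, and the head/tail pointers it updates on every dequeue live
on a cache line the producer never writes.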

An alternative would be to use a ring buffer, although the forward_alloc
stuff might be complex.
