On UDP packets processing, if the BH is the bottle-neck, it always sees a cache miss while updating rmem_alloc; try to avoid it prefetching the value as soon as we have the socket available.
Performances under flood with multiple NIC rx queues used are unaffected, but when a single NIC rx queue is in use, this gives ~10% performance improvement. Signed-off-by: Paolo Abeni <pab...@redhat.com> --- net/ipv4/udp.c | 1 + 1 file changed, 1 insertion(+) diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index f3450f0..067a607 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -1949,6 +1949,7 @@ static int udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb) } } + prefetch(&sk->sk_rmem_alloc); if (rcu_access_pointer(sk->sk_filter) && udp_lib_checksum_complete(skb)) goto csum_error; -- 2.9.4