On Tue, 2016-12-06 at 22:47 -0800, Eric Dumazet wrote:
> On Tue, 2016-12-06 at 19:32 -0800, Eric Dumazet wrote:
> > A follow up patch will provide a static_key (Jump Label) since most
> > hosts do not even use RFS.
>
> Speaking of static_key, it appears we now have GRO on UDP, and this
> consumes a considerable amount of cpu cycles.
>
> Turning off GRO allows me to get +20 % more packets on my single UDP
> socket. (1.2 Mpps instead of 1.0 Mpps)

I also see an improvement in single-flow tests when disabling GRO, but on
a smaller scale (~5%, if I recall correctly).

> Surely udp_gro_receive() should be bypassed if no UDP socket has
> registered a udp_sk(sk)->gro_receive handler
>
> And/or delay the inet_add_offload(&udpv{4|6}_offload, IPPROTO_UDP); to
> the first UDP sockets setting udp_sk(sk)->gro_receive handler,
> ie udp_encap_enable() and udpv6_encap_enable()

I had some patches adding explicit static keys for udp_gro_receive(), but
they were ugly and did not gain much (I measured only ~1-2% from skipping
udp_gro_receive() alone). I can try to refresh them anyway.
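
For illustration, the bypass discussed above could look roughly like the
user-space sketch below. This is not the actual patch: a plain counter
stands in for a kernel static key, and every name in it (udp_gro_users,
udp_gro_enable(), udp_gro_disable(), slow_udp_gro_receive(), udp_rx()) is
hypothetical. In the kernel the check would presumably be a static branch
that is patched in when the first socket registers a gro_receive handler.

```c
/* User-space illustration only; all names here are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct pkt {
	size_t len;
};

/* Stand-in for a static key: number of sockets with a GRO handler. */
static int udp_gro_users;

/* Counts how often the expensive path was actually taken. */
static unsigned long gro_calls;

/* Would be called when a socket installs a gro_receive handler. */
static void udp_gro_enable(void)
{
	udp_gro_users++;
}

/* Would be called when such a socket goes away. */
static void udp_gro_disable(void)
{
	assert(udp_gro_users > 0);
	udp_gro_users--;
}

/* Stand-in for the costly GRO receive path. */
static bool slow_udp_gro_receive(struct pkt *p)
{
	(void)p;
	gro_calls++;
	return true;
}

/* Returns true if the packet went through GRO, false if it was bypassed. */
static bool udp_rx(struct pkt *p)
{
	if (udp_gro_users == 0)
		return false;	/* nobody registered a handler: skip GRO */
	return slow_udp_gro_receive(p);
}
```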

We have some experimental patches implementing GRO for plain UDP
connected sockets, using frag_list to preserve the individual skb lengths
and deliver the packets to user space individually. With those I got
~3 Mpps with a single queue/user space sink, before the recent UDP
improvements. I would like to present these patches on netdev soon (no
sooner than next week, anyway).

Cheers,
Paolo