On Tue, 2016-12-06 at 22:47 -0800, Eric Dumazet wrote:
> On Tue, 2016-12-06 at 19:32 -0800, Eric Dumazet wrote:
> > A follow up patch will provide a static_key (Jump Label) since most
> > hosts do not even use RFS.
>
> Speaking of static_key, it appears we now have GRO on UDP, and this
> consumes a considerable amount of cpu cycles.
>
> Turning off GRO allows me to get +20 % more packets on my single UDP
> socket. (1.2 Mpps instead of 1.0 Mpps)
I also see an improvement for single-flow tests when disabling GRO, but
on a smaller scale (~5% if I recall correctly).

> Surely udp_gro_receive() should be bypassed if no UDP socket has
> registered a udp_sk(sk)->gro_receive handler
>
> And/or delay the inet_add_offload(&udpv{4|6}_offload, IPPROTO_UDP); to
> the first UDP sockets setting udp_sk(sk)->gro_receive handler,
> ie udp_encap_enable() and udpv6_encap_enable()

I had some patches adding explicit static keys for udp_gro_receive(),
but they were ugly and I did not get that much gain (I measured ~1-2%
when skipping udp_gro_receive() only). I can try to refresh them anyway.

We have some experimental patches to implement GRO for plain UDP
connected sockets, using frag_list to preserve the individual skb len,
and deliver the packets to user space individually. With that I got
~3 Mpps with a single queue/user space sink - before the recent UDP
improvements. I would like to present these patches on netdev soon (no
sooner than next week, anyway).

Cheers,

Paolo