Hi,

On Mon, 2017-12-18 at 12:11 +0800, zhangliping wrote:
> From: zhangliping <zhanglipin...@baidu.com>
>
> Under our udp pressure performance test, after gro is disabled, rx rate
> will be improved from ~2500kpps to ~2800kpps. We can find some difference
> from perf report:
> 1. gro is enabled:
>   24.23%   [kernel]   [k] udp4_lib_lookup2
>    5.42%   [kernel]   [k] __memcpy
>    3.87%   [kernel]   [k] fib_table_lookup
>    3.76%   [kernel]   [k] __netif_receive_skb_core
>    3.68%   [kernel]   [k] ip_rcv
>
> 2. gro is disabled:
>    9.66%   [kernel]   [k] udp4_lib_lookup2
>    9.47%   [kernel]   [k] __memcpy
>    4.75%   [kernel]   [k] fib_table_lookup
>    4.71%   [kernel]   [k] __netif_receive_skb_core
>    3.90%   [kernel]   [k] virtnet_poll
>
> So if there's no udp tunnel(such as vxlan) configured, we can skip
> the udp gro processing.

I tested something similar some time ago, but I measured a much smaller
gain. Also, the topmost perf offenders look quite different from what I
see here. Can you please share more details about the test case?

> Signed-off-by: zhangliping <zhanglipin...@baidu.com>
> ---
>  include/net/udp.h      |  2 ++
>  net/ipv4/udp_offload.c |  7 +++++++
>  net/ipv4/udp_tunnel.c  | 11 ++++++++++-
>  3 files changed, 19 insertions(+), 1 deletion(-)
>
> diff --git a/include/net/udp.h b/include/net/udp.h
> index 6c759c8594e2..c503f8b06845 100644
> --- a/include/net/udp.h
> +++ b/include/net/udp.h
> @@ -188,6 +188,8 @@ static inline struct udphdr *udp_gro_udphdr(struct
> sk_buff *skb)
>  	return uh;
>  }
>
> +extern struct static_key_false udp_gro_needed;
> +
>  /* hash routines shared between UDPv4/6 and UDP-Litev4/6 */
>  static inline int udp_lib_hash(struct sock *sk)
>  {
> diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
> index 01801b77bd0d..9cb11a833964 100644
> --- a/net/ipv4/udp_offload.c
> +++ b/net/ipv4/udp_offload.c
> @@ -10,10 +10,14 @@
>   * UDPv4 GSO support
>   */
>
> +#include <linux/static_key.h>
>  #include <linux/skbuff.h>
>  #include <net/udp.h>
>  #include <net/protocol.h>
>
> +DEFINE_STATIC_KEY_FALSE(udp_gro_needed);
> +EXPORT_SYMBOL_GPL(udp_gro_needed);
> +

I think that adding a new static key is not required, as we can
probably reuse 'udp_encap_needed' and 'udpv6_encap_needed'. Reusing
them also allows earlier branching (in udp4_gro_receive()/
udp6_gro_receive() instead of udp_gro_receive()).

Cheers,

Paolo