On Fri, Feb 26, 2021 at 4:23 PM Daniel Borkmann <dan...@iogearbox.net> wrote:
>
> We noticed a GRO issue for UDP-based encaps such as vxlan/geneve when the
> csum for the UDP header itself is 0. In that case, GRO aggregation does
> not take place on the phys dev, but instead is deferred to the vxlan/geneve
> driver (see trace below).
>
> The reason is essentially that GRO aggregation bails out in udp_gro_receive()
> for such case when drivers marked the skb with CHECKSUM_UNNECESSARY (ice, i40e,
> others) where for non-zero csums 2abb7cdc0dc8 ("udp: Add support for doing
> checksum unnecessary conversion") promotes those skbs to CHECKSUM_COMPLETE
> and napi context has csum_valid set. This is however not the case for zero
> UDP csum (here: csum_cnt is still 0 and csum_valid continues to be false).
>
> At the same time 57c67ff4bd92 ("udp: additional GRO support") added matches
> on !uh->check ^ !uh2->check as part to determine candidates for aggregation,
> so it certainly is expected to handle zero csums in udp_gro_receive(). The
> purpose of the check added via 662880f44203 ("net: Allow GRO to use and set
> levels of checksum unnecessary") seems to catch bad csum and stop aggregation
> right away.
>
> One way to fix aggregation in the zero case is to only perform the !csum_valid
> check in udp_gro_receive() if uh->check is in fact non-zero.
>
> Before:
>
> [...]
> swapper 0 [008] 731.946506: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100400 len=1500 (1)
> swapper 0 [008] 731.946507: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100200 len=1500
> swapper 0 [008] 731.946507: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101100 len=1500
> swapper 0 [008] 731.946508: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101700 len=1500
> swapper 0 [008] 731.946508: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101b00 len=1500
> swapper 0 [008] 731.946508: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100600 len=1500
> swapper 0 [008] 731.946508: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100f00 len=1500
> swapper 0 [008] 731.946509: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100a00 len=1500
> swapper 0 [008] 731.946516: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100500 len=1500
> swapper 0 [008] 731.946516: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100700 len=1500
> swapper 0 [008] 731.946516: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101d00 len=1500 (2)
> swapper 0 [008] 731.946517: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101000 len=1500
> swapper 0 [008] 731.946517: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101c00 len=1500
> swapper 0 [008] 731.946517: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101400 len=1500
> swapper 0 [008] 731.946518: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100e00 len=1500
> swapper 0 [008] 731.946518: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101600 len=1500
> swapper 0 [008] 731.946521: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100800 len=774
> swapper 0 [008] 731.946530: net:netif_receive_skb: dev=test_vxlan skbaddr=0xffff966497100400 len=14032 (1)
> swapper 0 [008] 731.946530: net:netif_receive_skb: dev=test_vxlan skbaddr=0xffff966497101d00 len=9112 (2)
> [...]
>
> # netperf -H 10.55.10.4 -t TCP_STREAM -l 20
> MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.55.10.4 () port 0 AF_INET : demo
> Recv   Send    Send
> Socket Socket  Message  Elapsed
> Size   Size    Size     Time     Throughput
> bytes  bytes   bytes    secs.    10^6bits/sec
>
>  87380  16384  16384    20.01    13129.24
>
> After:
>
> [...]
> swapper 0 [026] 521.862641: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff93ab0d479000 len=11286 (1)
> swapper 0 [026] 521.862643: net:netif_receive_skb: dev=test_vxlan skbaddr=0xffff93ab0d479000 len=11236 (1)
> swapper 0 [026] 521.862650: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff93ab0d478500 len=2898 (2)
> swapper 0 [026] 521.862650: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff93ab0d479f00 len=8490 (3)
> swapper 0 [026] 521.862653: net:netif_receive_skb: dev=test_vxlan skbaddr=0xffff93ab0d478500 len=2848 (2)
> swapper 0 [026] 521.862653: net:netif_receive_skb: dev=test_vxlan skbaddr=0xffff93ab0d479f00 len=8440 (3)
> [...]
>
> # netperf -H 10.55.10.4 -t TCP_STREAM -l 20
> MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.55.10.4 () port 0 AF_INET : demo
> Recv   Send    Send
> Socket Socket  Message  Elapsed
> Size   Size    Size     Time     Throughput
> bytes  bytes   bytes    secs.    10^6bits/sec
>
>  87380  16384  16384    20.01    24576.53
>
> Fixes: 57c67ff4bd92 ("udp: additional GRO support")
> Fixes: 662880f44203 ("net: Allow GRO to use and set levels of checksum unnecessary")
> Signed-off-by: Daniel Borkmann <dan...@iogearbox.net>
> Cc: Eric Dumazet <eduma...@google.com>
> Cc: Willem de Bruijn <will...@google.com>
> Cc: John Fastabend <john.fastab...@gmail.com>
> Cc: Jesse Brandeburg <jesse.brandeb...@intel.com>
> Cc: Tom Herbert <t...@herbertland.com>

Makes sense to me.

We cannot do checksum conversion with a zero field, but that does not
have to limit coalescing.

CHECKSUM_COMPLETE with a checksum validated by
skb_gro_checksum_validate_zero_check implies csum_valid.

So the test

> (skb->ip_summed != CHECKSUM_PARTIAL &&
>  NAPI_GRO_CB(skb)->csum_cnt == 0 &&
>  !NAPI_GRO_CB(skb)->csum_valid) ||

basically matches

- CHECKSUM_NONE
- CHECKSUM_UNNECESSARY which has already used up its valid state on a
  prior header
- CHECKSUM_COMPLETE with bad checksum.

This change just refines the test to not drop in the first two cases
on a zero checksum field.

Making this explicit in case anyone sees holes in the logic. Else,

Acked-by: Willem de Bruijn <will...@google.com>
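
For anyone who wants to poke at the case analysis above, here is a minimal
userspace sketch of the test as a plain predicate. It is not the kernel code:
struct gro_state, bails_before(), bails_after() and check_cases() are
hypothetical names made up for illustration, and the enum only mirrors the
kernel's CHECKSUM_* names. It assumes the fix simply gates the existing test on
uh->check being non-zero, as the patch description says.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the kernel's skb->ip_summed states (names only). */
enum ip_summed {
	CHECKSUM_NONE,
	CHECKSUM_UNNECESSARY,
	CHECKSUM_COMPLETE,
	CHECKSUM_PARTIAL,
};

/* Hypothetical flattened view of the fields the test reads. */
struct gro_state {
	enum ip_summed ip_summed;
	unsigned int csum_cnt;	/* remaining CHECKSUM_UNNECESSARY levels */
	bool csum_valid;	/* set once a checksum was validated */
	uint16_t uh_check;	/* UDP checksum field, 0 means no checksum */
};

/* The quoted test: true means aggregation is skipped for this skb. */
static bool bails_before(const struct gro_state *s)
{
	return s->ip_summed != CHECKSUM_PARTIAL &&
	       s->csum_cnt == 0 &&
	       !s->csum_valid;
}

/* Assumed shape of the fix: only apply the test for a non-zero field. */
static bool bails_after(const struct gro_state *s)
{
	return s->uh_check != 0 && bails_before(s);
}

/* Walk the cases listed in the mail, with zero and non-zero csum fields. */
static void check_cases(void)
{
	/* CHECKSUM_UNNECESSARY, level used up, zero field: the reported case. */
	struct gro_state unnec_zero = { CHECKSUM_UNNECESSARY, 0, false, 0 };
	/* CHECKSUM_NONE with a zero field. */
	struct gro_state none_zero = { CHECKSUM_NONE, 0, false, 0 };
	/* CHECKSUM_NONE with a non-zero, unvalidated field. */
	struct gro_state none_nonzero = { CHECKSUM_NONE, 0, false, 0x1234 };
	/* CHECKSUM_COMPLETE with a bad (unvalidated) non-zero checksum. */
	struct gro_state complete_bad = { CHECKSUM_COMPLETE, 0, false, 0x1234 };
	/* CHECKSUM_COMPLETE with a validated non-zero checksum. */
	struct gro_state complete_ok = { CHECKSUM_COMPLETE, 0, true, 0x1234 };

	assert(bails_before(&unnec_zero) && !bails_after(&unnec_zero));
	assert(bails_before(&none_zero) && !bails_after(&none_zero));
	assert(bails_before(&none_nonzero) && bails_after(&none_nonzero));
	assert(bails_before(&complete_bad) && bails_after(&complete_bad));
	assert(!bails_before(&complete_ok) && !bails_after(&complete_ok));
}
```

Calling check_cases() from a small main() exercises all five rows. Only the
two zero-field rows change behaviour between the old and the new predicate,
which is the refinement described above.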