On Mon, 2021-03-29 at 08:28 -0400, Willem de Bruijn wrote: > On Mon, Mar 29, 2021 at 7:26 AM Paolo Abeni <pab...@redhat.com> wrote: > > On Fri, 2021-03-26 at 14:30 -0400, Willem de Bruijn wrote: > > > On Thu, Mar 25, 2021 at 1:24 PM Paolo Abeni <pab...@redhat.com> wrote: > > > > When UDP packets generated locally by a socket with UDP_SEGMENT > > > > traverse the following path: > > > > > > > > UDP tunnel(xmit) -> veth (segmentation) -> veth (gro) -> > > > > UDP tunnel (rx) -> UDP socket (no UDP_GRO) > > > > > > > > they are segmented as part of the rx socket receive operation, and > > > > present a CHECKSUM_NONE after segmentation. > > > > > > would be good to capture how this happens, as it was not immediately > > > obvious. > > > > The CHECKSUM_PARTIAL is propagated up to the UDP tunnel processing, > > where we have: > > > > __iptunnel_pull_header() -> skb_pull_rcsum() -> > > skb_postpull_rcsum() -> __skb_postpull_rcsum() and the latter do the > > conversion. > > Please capture this in the commit message.
I will do. > > > > Additionally the segmented packets UDP CB still refers to the original > > > > GSO packet len. Overall that causes unexpected/wrong csum validation > > > > errors later in the UDP receive path. > > > > > > > > We could possibly address the issue with some additional checks and > > > > csum mangling in the UDP tunnel code. Since the issue affects only > > > > this UDP receive slow path, let's set a suitable csum status there. > > > > > > > > v1 -> v2: > > > > - restrict the csum update to the packets strictly needing them > > > > - hopefully clarify the commit message and code comments > > > > > > > > Signed-off-by: Paolo Abeni <pab...@redhat.com> > > > > + if (skb->ip_summed == CHECKSUM_NONE && !skb->csum_valid) > > > > + skb->csum_valid = 1; > > > > > > Not entirely obvious is that UDP packets arriving on a device with rx > > > checksum offload off, i.e., with CHECKSUM_NONE, are not matched by > > > this test. > > > > > > I assume that such packets are not coalesced by the GRO layer in the > > > first place. But I can't immediately spot the reason for it.. > > > > Packets with CHECKSUM_NONE are actually aggregated by the GRO engine. > > > > Their checksum is validated by: > > > > udp4_gro_receive -> skb_gro_checksum_validate_zero_check() > > -> __skb_gro_checksum_validate -> > > __skb_gro_checksum_validate_complete() > > > > and skb->ip_summed is changed to CHECKSUM_UNNECESSARY by: > > > > __skb_gro_checksum_validate -> skb_gro_incr_csum_unnecessary > > -> __skb_incr_checksum_unnecessary() > > > > and finally to CHECKSUM_PARTIAL by: > > > > udp4_gro_complete() -> udp_gro_complete() -> udp_gro_complete_segment() > > > > Do you prefer I resubmit with some more comments, either in the commit > > message or in the code? > > That breaks the checksum-and-copy optimization when delivering to > local sockets. I wonder if that is a regression. The conversion to CHECKSUM_UNNECESSARY happens since commit 573e8fca255a27e3573b51f9b183d62641c47a3d. Even the conversion to CHECKSUM_PARTIAL happens independently from this series, since commit 6f1c0ea133a6e4a193a7b285efe209664caeea43. I don't see a regression here ?!? Thanks! Paolo