On Tue, Jan 26, 2021 at 5:00 PM Willem de Bruijn <willemdebruijn.ker...@gmail.com> wrote: > > On Tue, Jan 26, 2021 at 4:54 PM Willem de Bruijn > <willemdebruijn.ker...@gmail.com> wrote: > > > > On Tue, Jan 26, 2021 at 9:58 AM Oliver Graute <oliver.gra...@gmail.com> > > wrote: > > > > > > Hello, > > > > > > we observe some unexpected behavior in the UDP implementation of the > > > linux kernel. > > > > > > Some UDP packets send via the loopback interface are dropped in the > > > kernel on the receive side when using sendto with the MSG_MORE flag. > > > Every drop increases the InCsumErrors in /proc/self/net/snmp. Some > > > example code to reproduce it is appended below. > > > > > > In the code we tracked it down to this code section. ( Even a little > > > further but its unclear to me wy the csum() is wrong in the bad case) > > > > > > udpv6_recvmsg() > > > ... > > > if (checksum_valid || udp_skb_csum_unnecessary(skb)) { > > > if (udp_skb_is_linear(skb)) > > > err = copy_linear_skb(skb, copied, off, > > > &msg->msg_iter); > > > else > > > err = skb_copy_datagram_msg(skb, off, msg, > > > copied); > > > } else { > > > err = skb_copy_and_csum_datagram_msg(skb, off, msg); > > > if (err == -EINVAL) { > > > goto csum_copy_err; > > > } > > > } > > > ... > > > > > > > Thanks for the report with a full reproducer. > > > > I don't have a full answer yet, but can reproduce this easily. > > > > The third program, without MSG_MORE, builds an skb with > > CHECKSUM_PARTIAL in __ip_append_data. When looped to the receive path > > that ip_summed means no additional validation is needed. As encoded in > > skb_csum_unnecessary. > > > > The first and second programs are essentially the same, bar for a > > slight difference in length. In both cases packet length is very short > > compared to the loopback device MTU. Because of MSG_MORE, these > > packets have CHECKSUM_NONE. > > > > On receive in > > > > __udp4_lib_rcv() > > udp4_csum_init() > > err = skb_checksum_init_zero_check() > > > > The second program validates and sets ip_summed = CHECKSUM_COMPLETE > > and csum_valid = 1. > > The first does not, though err == 0. > > > > This appears to succeed consistently for packets <= 68B of payload, > > fail consistently otherwise. It is not clear to me yet what causes > > this distinction. > > This is from > > " > /* For small packets <= CHECKSUM_BREAK perform checksum complete directly > * in checksum_init. > */ > #define CHECKSUM_BREAK 76 > " > > So the small packet gets checksummed immediately in > __skb_checksum_validate_complete, but the larger one does not. > > Question is why the copy_and_checksum you pointed to seems to fail checksum.
Manually calling __skb_checksum_complete(skb) in skb_copy_and_csum_datagram_msg succeeds, so it is the skb_copy_and_csum_datagram that returns an incorrect csum. Bisection shows that this is a regression in 5.0, between 65d69e2505bb datagram: introduce skb_copy_and_hash_datagram_iter helper (fail) d05f443554b3 iov_iter: introduce hash_and_copy_to_iter helper 950fcaecd5cc datagram: consolidate datagram copy to iter helpers cb002d074dab iov_iter: pass void csum pointer to csum_and_copy_to_iter (pass) That's a significant amount of code change. I'll take a closer look, but checkpointing state for now..