On Fri, Dec 18, 2015 at 11:34 AM, Vijay Pandurangan <vij...@vijayp.ca> wrote: > Packets that arrive from real hardware devices have ip_summed == > CHECKSUM_UNNECESSARY if the hardware verified the checksums, or > CHECKSUM_NONE if the packet is bad or it was unable to verify it. The > current version of veth will replace CHECKSUM_NONE with > CHECKSUM_UNNECESSARY, which causes corrupt packets routed from hardware to > a veth device to be delivered to the application. This caused applications > at Twitter to receive corrupt data when network hardware was corrupting > packets. > > We believe this was added as an optimization to skip computing and > verifying checksums for communication between containers. However, locally > generated packets have ip_summed == CHECKSUM_PARTIAL, so the code as > written does nothing for them. As far as we can tell, after removing this > code, these packets are transmitted from one stack to another unmodified > (tcpdump shows invalid checksums on both sides, as expected), and they are > delivered correctly to applications. We didn’t test every possible network > configuration, but we tried a few common ones such as bridging containers, > using NAT between the host and a container, and routing from hardware > devices to containers. We have effectively deployed this in production at > Twitter (by disabling RX checksum offloading on veth devices). > > This code dates back to the first version of the driver, commit > <e314dbdc1c0dc6a548ecf> ("[NET]: Virtual ethernet device driver"), so I > suspect this bug occurred mostly because the driver API has evolved > significantly since then. Commit <0b7967503dc97864f283a> ("net/veth: Fix > packet checksumming") (in December 2010) fixed this for packets that get > created locally and sent to hardware devices, by not changing > CHECKSUM_PARTIAL. However, the same issue still occurs for packets coming > in from hardware devices. > > Co-authored-by: Evan Jones <e...@evanjones.ca> > Signed-off-by: Evan Jones <e...@evanjones.ca> > Cc: Nicolas Dichtel <nicolas.dich...@6wind.com> > Cc: Phil Sutter <p...@nwl.cc> > Cc: Toshiaki Makita <makita.toshi...@lab.ntt.co.jp> > Cc: netdev@vger.kernel.org > Cc: linux-ker...@vger.kernel.org > Signed-off-by: Vijay Pandurangan <vij...@vijayp.ca>
Acked-by: Cong Wang <cw...@twopensource.com> -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html