On Wed, Apr 18, 2018 at 11:22 AM, Willem de Bruijn <willemdebruijn.ker...@gmail.com> wrote: > On Wed, Apr 18, 2018 at 2:12 PM, Alexander Duyck > <alexander.du...@gmail.com> wrote: >> On Wed, Apr 18, 2018 at 10:28 AM, David Miller <da...@davemloft.net> wrote: >>> From: Sowmini Varadhan <sowmini.varad...@oracle.com> >>> Date: Wed, 18 Apr 2018 08:31:03 -0400 >>> >>>> However, I share Sridhar's concerns about the very fundamental change >>>> to UDP message boundary semantics here. There is actually no such thing >>>> as a "segment" in udp, so in general this feature makes me a little >>>> uneasy. Well behaved udp applications should already be sending mtu >>>> sized datagrams. And the not-so-well-behaved ones are probably relying >>>> on IP fragmentation/reassembly to take care of datagram boundary semantics >>>> for them? >>>> >>>> As Sridhar points out, the feature is not really "negotiated" - one side >>>> unilaterally sets the option. If the receiver is a classic/POSIX UDP >>>> implementation, it will have no way of knowing that message boundaries >>>> have been re-adjusted at the sender. >>> >>> There are no "semantics". >>> >>> What ends up on the wire is the same before the kernel/app changes as >>> afterwards. >>> >>> The only difference is that instead of the application doing N - 1 >>> sendmsg() calls with mtu sized writes, it's giving everything all at >>> once and asking the kernel to segment. >>> >>> It even gives the application control over the size of the packets, >>> which I think is completely prudent in this situation. >> >> My only concern with the patch set is verifying what mitigations are >> in case so that we aren't trying to set an MSS size that results in a >> frame larger than MTU. I'm still digging through the code and trying >> to grok it, but I figured I might just put the question out there to >> may my reviewing easier. > > This is checked in udp_send_skb in > > const int hlen = skb_network_header_len(skb) + > sizeof(struct udphdr); > > if (hlen + cork->gso_size > cork->fragsize) > return -EINVAL; > > At this point cork->fragsize will have been set in ip_setup_cork > based on device or path mtu: > > cork->fragsize = ip_sk_use_pmtu(sk) ? > dst_mtu(&rt->dst) : rt->dst.dev->mtu; > > The feature bypasses the MTU sanity checks in __ip_append_data > by setting local variable mtu to a network layer max > > mtu = cork->gso_size ? IP_MAX_MTU : cork->fragsize; > > but the above should perform the same check, only later. The > check in udp_send_skb is simpler as it does not have to deal > with the case of fragmentation. > >> Also any plans for HW offload support for this? I vaguely recall that >> the igb and ixgbe parts had support for something like this in >> hardware. I would have to double check to see what exactly is >> supported. > > I hadn't given that much thought until the request yesterday to > expose the NETIF_F_GSO_UDP_L4 flag through ethtool. By > virtue of having only a single fixed segmentation length, it > appears reasonably straightforward to offload.
Actually I just got a chance to start on a review of things. Do we need to have to use both GSO_UDP and and GSO_UDP_L4? It might be better if we could split these up and specifically call out GSO_UDP as UFO and GSO_UDP_L4 as being UDP segmentation. My only concern is that I don't think devices would want to have to try and advertise both GSO_UDP and GSO_UDP_L4 if they want to support UDP segmentation only. Ideally we want to keep UFO separate from UDP segmentation since they would be two different things. Thanks. - Alex