Hi, On Fri, 19 Aug 2016 11:20:40 +0200 Hannes Frederic Sowa <han...@stressinduktion.org> wrote: > >> Maybe we can change our criteria in the following manner: > >> > >> - if (skb_iif && proto == IPPROTO_UDP) { > >> + if (skb_iif && !(df & htons(IP_DF))) { > >> IPCB(skb)->flags |= IPSKB_FRAG_SEGS; > >> > >> This way, any tunnel explicitly providing DF is NOT allowed to > >> further fragment the resulting segments (leading to tx segments being > >> dropped). > >> Others tunnels, that do not care (e.g. vxlan and geneve, and probably > >> ovs vport-gre, or other ovs encap vports, in df_default=false mode), > >> will behave same for gso and non-gso. > >> > > I am really not sure... > > Probably we have no other choice.
Further diving into this, seems the !IP_DF approach is more correct then the IPPROTO_UDP approach (WRT packets/segments arriving from other interface, that exceed egress mtu): vxlan/geneve: Both set df to zero. !IP_DF approach acts same as IPPROTO_UDP approach vxlan/geneve in collect_md (e.g. OvS): They set df according to tun_flags & TUNNEL_DONT_FRAGMENT. IPPROTO_UDP approach: IPSKB_FRAG_SEGS gets set unconditionally. In case TUNNEL_DONT_FRAGMENT requested, non-gso get dropped due to IPSTATS_MIB_FRAGFAILS, whereas gso gets segmented+fragmented (!) !IP_DF approach: Aligned, both non-gso and gso gets dropped for TUNNEL_DONT_FRAGMENT. ip_gre in collect_md (e.g. OvS): Sets df according to tun_flags & TUNNEL_DONT_FRAGMENT. IPPROTO_UDP approach: IPSKB_FRAG_SEGS is never set. Therefore in the case were df is NOT set, non-gso are fragged and passed, whereas gso gets dropped (!) !IP_DF approach: Non-gso vs gso aligned. ip_gre in nopmtudisc: Will pass tnl_update_pmtu checks; Then, df inherrited from inner_iph (or stays unset if IFLA_GRE_IGNORE_DF specified). IPPROTO_UDP approach: IPSKB_FRAG_SEGS never set. Therefore in the case were df is NOT set, non-gso are fragged and passed, whereas gso gets dropped (!) !IP_DF approach: Aligned. ip_gre in fou/gue mode in nopmtudisc: Assuming they pass tnl_update_pmtu checks; Then, df inherrited from inner_iph (or stays unset if IFLA_GRE_IGNORE_DF specified). IPPROTO_UDP approach: IPSKB_FRAG_SEGS gets always set (since proto==IPPROTO_UDP). In the case df is set, non-gso dropped by IPSTATS_MIB_FRAGFAILS, whereas gso gets segmented+fragmented (!) !IP_DF approach: Aligned. ip_gre in pmtudisc: Sets df to IP_DF. Non-gso will fail tnl_update_pmtu checks (gso should pass). IPPROTO_UDP approach: IPSKB_FRAG_SEGS never set. This leads the gso skbs to be eventually dropped. okay. !IP_DF approach: IPSKB_FRAG_SEGS not set, since IP_DF is true. This leads to gso skbs to be eventually dropped. okay. (truely appreciate if you can review my above analysis) Therefore using !(df & htons(IP_DF)) actually fixes some oversights of our former proto==IPPROTO_UDP approach. I'll send a patch. Thanks Shmulik