Hi,

On Fri, 19 Aug 2016 11:20:40 +0200 Hannes Frederic Sowa 
<han...@stressinduktion.org> wrote:
> >> Maybe we can change our criteria in the following manner:
> >>  
> >> -  if (skb_iif && proto == IPPROTO_UDP) {
> >> +  if (skb_iif && !(df & htons(IP_DF))) {
> >>            IPCB(skb)->flags |= IPSKB_FRAG_SEGS;
> >>
> >> This way, any tunnel explicitly providing DF is NOT allowed to
> >> further fragment the resulting segments (leading to tx segments being
> >> dropped).
> >> Others tunnels, that do not care (e.g. vxlan and geneve, and probably
> >> ovs vport-gre, or other ovs encap vports, in df_default=false mode),
> >> will behave same for gso and non-gso.
> >>
> 
> I am really not sure...
> 
> Probably we have no other choice.

Further diving into this, seems the !IP_DF approach is more correct 
then the IPPROTO_UDP approach (WRT packets/segments arriving from other
interface, that exceed egress mtu):

vxlan/geneve:
  Both set df to zero.
  !IP_DF approach acts same as IPPROTO_UDP approach

vxlan/geneve in collect_md (e.g. OvS):
  They set df according to tun_flags & TUNNEL_DONT_FRAGMENT.
  IPPROTO_UDP approach:
    IPSKB_FRAG_SEGS gets set unconditionally.
    In case TUNNEL_DONT_FRAGMENT requested, non-gso get dropped
    due to IPSTATS_MIB_FRAGFAILS, whereas gso gets segmented+fragmented (!)
  !IP_DF approach:
    Aligned, both non-gso and gso gets dropped for TUNNEL_DONT_FRAGMENT.

ip_gre in collect_md (e.g. OvS):
  Sets df according to tun_flags & TUNNEL_DONT_FRAGMENT.
  IPPROTO_UDP approach:
    IPSKB_FRAG_SEGS is never set.
    Therefore in the case were df is NOT set, non-gso are fragged and
    passed, whereas gso gets dropped (!)
  !IP_DF approach:
    Non-gso vs gso aligned.

ip_gre in nopmtudisc:
  Will pass tnl_update_pmtu checks; Then, df inherrited from inner_iph
  (or stays unset if IFLA_GRE_IGNORE_DF specified).
  IPPROTO_UDP approach:
    IPSKB_FRAG_SEGS never set.
    Therefore in the case were df is NOT set, non-gso are fragged and
    passed, whereas gso gets dropped (!)
  !IP_DF approach:
    Aligned.
    
ip_gre in fou/gue mode in nopmtudisc:
  Assuming they pass tnl_update_pmtu checks; Then, df inherrited from
  inner_iph (or stays unset if IFLA_GRE_IGNORE_DF specified).
  IPPROTO_UDP approach:
    IPSKB_FRAG_SEGS gets always set (since proto==IPPROTO_UDP).
    In the case df is set, non-gso dropped by IPSTATS_MIB_FRAGFAILS,
    whereas gso gets segmented+fragmented (!)
  !IP_DF approach: 
    Aligned.

ip_gre in pmtudisc:
  Sets df to IP_DF.
  Non-gso will fail tnl_update_pmtu checks (gso should pass).
  IPPROTO_UDP approach:
    IPSKB_FRAG_SEGS never set. This leads the gso skbs to be eventually
    dropped. okay.
  !IP_DF approach:
    IPSKB_FRAG_SEGS not set, since IP_DF is true.
    This leads to gso skbs to be eventually dropped. okay.

(truely appreciate if you can review my above analysis)

Therefore using !(df & htons(IP_DF)) actually fixes some oversights of
our former proto==IPPROTO_UDP approach.

I'll send a patch.

Thanks
Shmulik

Reply via email to