Hello,

On Wed, 8 Jul 2015 21:51:56 +0300
Timo Teras <timo.te...@iki.fi> wrote:

> On Wed, 08 Jul 2015 19:39:58 +0200
> Hannes Frederic Sowa <han...@stressinduktion.org> wrote:
>
> > At least we know which interface the packet would leave. Should we
> > override this behavior on a per-interface basis?
>
> That would be one option. We could also make the exception just for
> GRE interface in the DF mode. Or some sort of per-interface flag that
> is set internally by the driver.
>
> > (Although I am in favor of admins just correcting the mtu by hand
> > and documenting this as you proposed earlier. I really don't know
> > if it is worth the effort to propagate those information.).
>
> The problem with GRE + DF is that the internal packet is no-DF and
> potentially fragmented, but the tunneled packets (fragments) do have
> DF set. So no interim router can defrag+frag if needed. This means
> pmtu must be honored. And on nbma gre tunnels (target tunnel address
> depends on encapsulated packet's address) the path mtu needs to be
> propagated back to the sender.
>
> In this configuration ip_forward_use_pmtu needs to be enabled (system
> wide, per-interface config, or implicitly by interface flags). Or
> alternatively the trusted pmtu needs to be propagated via alternate
> mechanism. But that might be quite tricky to implement.
>
> I believe other tunnels have similar mechanism. E.g. ipip tunnels
> seems to share same DF vs. non-DF mode based on 'ttl' setting.

Thinking more on this, the issue happens even without XFRM. XFRM just
makes it show up almost immediately. Basically, tunnel devices with the
'pmtudisc' flag set (the default for many, and especially if the 'ttl'
parameter is used) always have the issue. As the tunnel header always
has DF set, the routers handling the tunnel packets cannot really
re-fragment.

I believe we should have a flag describing this functionality, and
enable ip_forward_use_pmtu for those interfaces. IFF_XMIT_DST_RELEASE is
a prerequisite to update the inner flow pmtu, so it might be used as a
hint.
Though, it's set even if 'nopmtudisc' was configured. So I suppose this
would deserve its own flag. Grepping for drivers calling
netif_keep_dst() gives a list of drivers which need to be checked to
see if they rely on this behaviour.

Thanks,
Timo
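
The per-interface flag idea discussed above can be modelled roughly as
follows. This is only a user-space sketch, not kernel code:
MODEL_IFF_TUNNEL_DF, struct model_dev, struct model_route and
model_forward_mtu() are hypothetical names invented for illustration,
and the logic is an assumption about how such a flag might combine with
the system-wide ip_forward_use_pmtu setting.

```c
#include <stdbool.h>

/*
 * Hypothetical per-interface flag (not a real kernel flag): the device
 * emits tunnel packets with DF set, so routers on the tunnel path
 * cannot re-fragment them.
 */
#define MODEL_IFF_TUNNEL_DF 0x1u

/* Simplified stand-in for a network device. */
struct model_dev {
	unsigned int mtu;	/* configured device MTU */
	unsigned int flags;	/* MODEL_IFF_* bits */
};

/* Simplified stand-in for a route with a learned path MTU. */
struct model_route {
	const struct model_dev *dev;
	unsigned int pmtu;	/* learned path MTU, 0 if none is known */
};

/*
 * MTU to apply when forwarding a packet out of rt->dev.
 *
 * The learned path MTU is honored if the system-wide setting is on, or
 * if the outgoing device carries the hypothetical DF-tunnel flag.
 * Otherwise only the device MTU is used.
 */
unsigned int model_forward_mtu(const struct model_route *rt,
			       bool sysctl_fwd_use_pmtu)
{
	bool use_pmtu = sysctl_fwd_use_pmtu ||
			(rt->dev->flags & MODEL_IFF_TUNNEL_DF) != 0;

	if (use_pmtu && rt->pmtu != 0 && rt->pmtu < rt->dev->mtu)
		return rt->pmtu;

	return rt->dev->mtu;
}
```

With a flagged tunnel device of MTU 1476 and a learned path MTU of 1400,
the model returns 1400 even when the system-wide setting is off. An
unflagged device of MTU 1500 with the same learned path MTU keeps
forwarding at 1500 until the system-wide setting is enabled.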