On Sun, 12 Jul 2020 22:07:02 +0200
Florian Westphal <f...@strlen.de> wrote:

> There are existing deployments where a vxlan or geneve interface is part
> of a bridge.
> 
> In this case, MTU may look like this:
> 
> bridge mtu: 1450
> vxlan (bridge port) mtu: 1450
> other bridge ports: 1450
> 
> physical link (used by vxlan) mtu: 1500.
> 
> This makes sure that vxlan overhead (50 bytes) doesn't bring packets over the
> 1500 MTU of the physical link.
> 
> Unfortunately, in some cases, PMTU updates on the encap socket
> can bring such setups into a non-working state: no traffic will pass
> over the vxlan port (physical link) anymore.
> Because of the bridge-based usage of the vxlan interface, the original
> sender never learns of the change in path mtu and TCP clients will retransmit
> the over-sized packets until timeout.
> 
> 
> When this happens, an 'ip route flush cache' in the netns holding
> the vxlan interface resolves the problem, i.e. the network is capable
> of transporting the packets and the PMTU update is bogus.
> 
> Another workaround is to enable 'net.ipv4.tcp_mtu_probing'.
> 
> This patch series allows vxlan and geneve interfaces to be configured
> to ignore path mtu updates.

Regardless of the comments on 1/3, I don't have any problem with this
series (I haven't reviewed it yet) if it's currently the only way to work
around the issue (of course :)).

I think we should eventually fix PMTU discovery for bridged setups, but
perhaps it's more complicated than that.

I wonder, though:

- wouldn't setting /proc/sys/net/ipv4/ip_no_pmtu_disc have the same
  effect?

- does it really make sense to have this configurable for IPv6?
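
For reference on the first question, a minimal Python sketch for checking
what ip_no_pmtu_disc is currently set to in the netns holding the vxlan
interface. The helper names are hypothetical and the mode descriptions are
my paraphrase of the kernel's ip-sysctl documentation, not part of this
series. It only reads the setting; whether any of these modes really gives
the same effect for the bridged case is the open question.

```python
from pathlib import Path

# Default location of the sysctl; run inside the relevant netns.
SYSCTL_PATH = "/proc/sys/net/ipv4/ip_no_pmtu_disc"

# Paraphrased from Documentation/networking/ip-sysctl (an assumption of
# this sketch, check the documentation for your kernel version).
MODES = {
    0: "PMTU discovery enabled (default)",
    1: "PMTU discovery disabled; a frag-needed ICMP lowers the PMTU to "
       "min(old MTU, min_pmtu)",
    2: "incoming PMTU discovery messages silently discarded; outgoing "
       "frames handled as in mode 1",
    3: "hardened mode; frag-needed errors accepted only if the "
       "underlying protocol can verify them",
}


def read_no_pmtu_disc(path=SYSCTL_PATH):
    """Return the integer currently stored in the sysctl file."""
    return int(Path(path).read_text().strip())


def describe_mode(mode):
    """Map a sysctl value to a short human-readable description."""
    return MODES.get(mode, "unknown mode")


if __name__ == "__main__":
    try:
        mode = read_no_pmtu_disc()
    except (OSError, ValueError) as exc:
        # Not Linux, no procfs, or unexpected file contents.
        print(f"could not read {SYSCTL_PATH}: {exc}")
    else:
        print(f"ip_no_pmtu_disc = {mode}: {describe_mode(mode)}")
```

Note that this sysctl only exists under net/ipv4, so it says nothing about
the IPv6 side.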

-- 
Stefano
