David Ahern <dsah...@gmail.com> wrote:
> On 7/13/20 2:04 AM, Florian Westphal wrote:
> >> As PMTU discovery happens, we have a route exception on the lower
> >> layer for the given path, and we know that VXLAN will use that path,
> >> so we also know there's no point in having a higher MTU on the VXLAN
> >> device, it's really the maximum packet size we can use.
> > No, in the setup that prompted this series the route exception is wrong.
>
> Why is the exception wrong and why can't the exception code be fixed to
> include tunnel headers?

I don't know. This occurs in a 3rd party (read: "cloud") environment.
After some days, TCP connections on the overlay network hang.

Flushing the route exception in the namespace of the vxlan interface makes
the traffic flow again, i.e. if the vxlan tunnel would just use the physical
device's MTU things would be fine.

I don't know what you mean by 'fix exception code to include tunnel headers'.
Can you elaborate?

AFAICS everything functions as designed, except:

1. The route exception should not exist in the first place in this case
2. The route exception never times out (it gets refreshed every time the
   tunnel tries to send an MTU-sized packet).
3. The original sender never learns about the PMTU event

Regarding 3) I had cooked up patches to inject a new ICMP error into the
bridge input path from vxlan_err_lookup() to let the sender know about the
path MTU reduction. Unfortunately it only works with Linux bridge
(openvswitch tosses the packet). Also, too many (internal) reviews told me
they consider this an ugly hack, so I am not too keen on continuing down
that route:

https://git.breakpoint.cc/cgit/fw/net-next.git/commit/?h=udp_tun_pmtud_12&id=ca5b0af203b6f8010f1e585850620db4561baae7
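
To make the MTU arithmetic behind points 1 and 2 concrete, here is a minimal
Python sketch of the VXLAN encapsulation overhead. It assumes the standard
header sizes (outer IPv4 20 bytes without options, or IPv6 40 bytes without
extension headers, UDP 8, VXLAN 8, inner Ethernet 14). The function names and
the reduced path MTU of 1450 are hypothetical illustrations, not values taken
from the setup described above.

```python
# Assumed standard header sizes, in bytes (no IP options / extension headers).
OUTER_IPV4 = 20
OUTER_IPV6 = 40
UDP_HDR = 8
VXLAN_HDR = 8
INNER_ETH = 14


def vxlan_inner_mtu(path_mtu: int, ipv6: bool = False) -> int:
    """Largest inner IP packet that fits in one outer packet of path_mtu bytes."""
    outer_ip = OUTER_IPV6 if ipv6 else OUTER_IPV4
    return path_mtu - (outer_ip + UDP_HDR + VXLAN_HDR + INNER_ETH)


def inner_fits(inner_ip_len: int, path_mtu: int, ipv6: bool = False) -> bool:
    """True if an inner IP packet of inner_ip_len bytes can be encapsulated
    without exceeding path_mtu on the underlay."""
    return inner_ip_len <= vxlan_inner_mtu(path_mtu, ipv6)


if __name__ == "__main__":
    # Underlay uses the physical device MTU: a 1450-byte inner packet fits.
    print(vxlan_inner_mtu(1500), inner_fits(1450, 1500))
    # Hypothetical route exception lowering the underlay path MTU to 1450:
    # the same inner packet no longer fits, so every MTU-sized send hits
    # the exception again.
    print(vxlan_inner_mtu(1450), inner_fits(1450, 1450))
```

With a 1500-byte underlay MTU the overlay can carry 1450-byte inner packets
over IPv4. Once an exception lowers the underlay path MTU, senders on the
overlay that still assume 1450 produce packets that cannot be encapsulated,
which matches the hang described above when the sender is never told.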