Guillaume Nault <gna...@redhat.com> writes:

> On Mon, Jul 14, 2025 at 09:57:52PM +0200, Salvatore Bonaccorso wrote:
>> Hi,
>> 
>> Charles Bordet reported the following issue (full context in
>> https://bugs.debian.org/1108860)
>> 
>> > Dear Maintainer,
>> > 
>> > What led up to the situation?
>> > We run a production environment using Debian 12 VMs, with a network
>> > topology involving VXLAN tunnels encapsulated inside Wireguard
>> > interfaces. This setup has worked reliably for over a year, with MTU set
>> > to 1500 on all interfaces except the Wireguard interface (set to 1420).
>> > Wireguard kernel fragmentation allowed this configuration to function
>> > without issues, even though the effective path MTU is lower than 1500.
>> > 
>> > What exactly did you do (or not do) that was effective (or ineffective)?
>> > We performed a routine system upgrade, updating all packages including the
>> > kernel. After the upgrade, we observed severe network issues (timeouts,
>> > very slow HTTP/HTTPS, and apt update failures) on all VMs behind the
>> > router. SSH and small-packet traffic continued to work.
>> > 
>> > To diagnose, we:
>> > 
>> > * Restored a backup (with the previous kernel): the problem disappeared.
>> > * Repeated the upgrade, confirming the issue reappeared.
>> > * Systematically tested each kernel version from 6.1.124-1 up to
>> > 6.1.140-1. The problem first appears with kernel 6.1.135-1; all earlier
>> > versions work as expected.
>> > * The kernel version from backports (6.12.32-1) did not resolve the
>> > problem.
>> > 
>> > What was the outcome of this action?
>> > 
>> > * With kernel 6.1.135-1 or later, network timeouts occur for
>> > large-packet protocols (HTTP, apt, etc.), while SSH and small-packet
>> > protocols work.
>> > * With kernel 6.1.133-1 or earlier, everything works as expected.
>> > 
>> > What outcome did you expect instead?
>> > We expected the network to function as before, with Wireguard handling
>> > fragmentation transparently and no application-level timeouts,
>> > regardless of the kernel version.
>> 
>> While triaging the issue we found that commit 8930424777e4
>> ("tunnels: Accept PACKET_HOST in skb_tunnel_check_pmtu().") introduces
>> the issue, and Charles confirmed that the issue is also present in
>> 6.12.35 and 6.15.4 (later versions could potentially still be
>> affected, but we wanted to check that it is not a 6.1.y-specific
>> regression).
>> 
>> Reverting the commit fixes Charles' issue.
>> 
>> Does that ring a bell?
>
> It doesn't ring a bell. Do you have more details on the setup that has
> the problem? Or, ideally, a self-contained reproducer?

+1 - I tested this patch with an OVS setup using vxlan and geneve
tunnels.  A reproducer or more details would help.
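
For reference, a rough sketch of the MTU arithmetic in the reported
setup. It assumes VXLAN runs over an IPv4 underlay with no VLAN tag
(the report does not say which address family is used), and the names
below are illustrative only. With those assumptions, a full-size packet
on a 1500-MTU VXLAN interface does not fit in the 1420-byte Wireguard
MTU, so the setup depends on fragmentation or on PMTU handling.

```python
# Assumed header sizes in bytes (IPv4 underlay, untagged inner frame).
INNER_ETHERNET = 14
VXLAN_HEADER = 8
OUTER_UDP = 8
OUTER_IPV4 = 20
VXLAN_OVERHEAD = INNER_ETHERNET + VXLAN_HEADER + OUTER_UDP + OUTER_IPV4


def encapsulated_size(inner_ip_len: int) -> int:
    """Size of the outer IP packet carrying an inner IP packet."""
    return inner_ip_len + VXLAN_OVERHEAD


def max_inner_mtu(underlay_mtu: int) -> int:
    """Largest inner MTU that avoids fragmenting the outer packet."""
    return underlay_mtu - VXLAN_OVERHEAD


WG_MTU = 1420        # Wireguard interface MTU from the report
VXLAN_IF_MTU = 1500  # MTU on the other interfaces from the report

outer = encapsulated_size(VXLAN_IF_MTU)
print(f"outer packet: {outer} bytes, exceeds wg MTU: {outer > WG_MTU}")
print(f"largest inner MTU without fragmentation: {max_inner_mtu(WG_MTU)}")
```

This would also be consistent with SSH and other small-packet traffic
still working, since those packets stay under the limit.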

>> Regards,
>> Salvatore
>> 
