Thanks a lot for checking, Itai! As discussed offline, we made another attempt with the 5.19.0-28-generic kernel and firmware 22.35.2302 on a different system, and did not run into this issue there either.
Will set this to incomplete until we regain access to the system where this was first observed so we can compare sw/hw components.

** Changed in: linux (Ubuntu)
       Status: Confirmed => Incomplete

-- 
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1999229

Title:
  mlx5 VF LAG flapping

Status in linux package in Ubuntu:
  Incomplete

Bug description:
  # sudo lsb_release -a
  No LSB modules are available.
  Distributor ID: Ubuntu
  Description:    Ubuntu 22.04.1 LTS
  Release:        22.04
  Codename:       jammy

  # mlxfwmanager
  Querying Mellanox devices firmware ...

  Device #1:
  ----------

    Device Type:      ConnectX6DX
    Part Number:      MCX623106AN-CDA_Ax
    Description:      ConnectX-6 Dx EN adapter card; 100GbE; Dual-port QSFP56; PCIe 4.0/3.0 x16;
    PSID:             MT_0000000359
    PCI Device Name:  0000:41:00.0
    Base GUID:        08c0eb03006fb26e
    Base MAC:         08c0eb6fb26e
    Versions:         Current        Available
       FW             22.34.4000     N/A
       PXE            3.6.0700       N/A
       UEFI           14.27.0015     N/A

  # uname -a
  Linux ps6-ra1-n2 5.19.0-24-generic #25~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Fri Nov 18 14:28:08 UTC 2 x86_64 x86_64 x86_64 GNU/Linux

  Kernel from linux-generic-hwe-22.04-edge package in jammy-proposed, see https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed.
  Problem:
  Severe packet loss to high speed NIC due to what appears as VF LAG flapping:

  [Fri Dec 9 07:27:19 2022] mlx5_core 0000:41:00.0: mlx5_cmd_out_err:778:(pid 3383): SET_FLOW_TABLE_ENTRY(0x936) op_mod(0x0) failed, status bad resource(0x5), syndrome (0xf2ff71), err(-22)
  [Fri Dec 9 07:27:19 2022] mlx5_core 0000:41:00.0: E-Switch: Failed to create termination table rule, err -EINVAL
  [Fri Dec 9 07:27:19 2022] mlx5_core 0000:41:00.0: E-Switch: Failed to get termination table, err -EINVAL
  [Fri Dec 9 07:27:19 2022] mlx5_core 0000:41:00.1: mlx5_cmd_out_err:778:(pid 3383): SET_FLOW_TABLE_ENTRY(0x936) op_mod(0x0) failed, status bad resource(0x5), syndrome (0xf2ff71), err(-22)
  [Fri Dec 9 07:27:19 2022] mlx5_core 0000:41:00.1: E-Switch: Failed to create termination table rule, err -EINVAL
  [Fri Dec 9 07:27:19 2022] mlx5_core 0000:41:00.1: E-Switch: Failed to get termination table, err -EINVAL
  [Fri Dec 9 07:27:20 2022] mlx5_core 0000:41:00.0: lag map active ports: 1, 2
  [Fri Dec 9 07:27:20 2022] mlx5_core 0000:41:00.0: lag map active ports: 2
  [Fri Dec 9 07:27:20 2022] mlx5_core 0000:41:00.0: lag map active ports: 1, 2
  [Fri Dec 9 07:27:21 2022] mlx5_core 0000:41:00.0: lag map active ports: 2
  [Fri Dec 9 07:27:21 2022] mlx5_core 0000:41:00.0: lag map active ports: 1, 2
  [Fri Dec 9 07:27:22 2022] mlx5_core 0000:41:00.0: lag map active ports: 2
  [Fri Dec 9 07:27:23 2022] mlx5_core 0000:41:00.0: lag map active ports: 1, 2
  [Fri Dec 9 07:27:23 2022] mlx5_core 0000:41:00.0: lag map active ports: 2

  This does not happen when using the Jammy 5.15 kernel, everything else in the environment being equal.
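
When comparing the affected and unaffected systems, it may help to quantify the flapping instead of eyeballing dmesg. Below is a minimal sketch, not part of the original report, that counts how often the "lag map active ports" set changes per PCI device. The function name count_lag_changes and the SAMPLE list are illustrative; the regular expression assumes the exact message format shown in the log above and may need adjusting for other kernel versions.

```python
import re
from collections import Counter

# Assumed message format, taken from the log in this bug report:
# [Fri Dec 9 07:27:20 2022] mlx5_core 0000:41:00.0: lag map active ports: 1, 2
LAG_RE = re.compile(
    r"mlx5_core (?P<dev>[0-9a-fA-F:.]+): lag map active ports: (?P<ports>[0-9, ]+)"
)


def count_lag_changes(lines):
    """Return {pci_device: number of times the active-port set changed}."""
    last = {}            # last seen active-port tuple per device
    changes = Counter()  # number of transitions per device
    for line in lines:
        m = LAG_RE.search(line)
        if not m:
            continue  # not a LAG map message
        dev = m.group("dev")
        ports = tuple(int(p) for p in m.group("ports").split(",") if p.strip())
        if dev in last and last[dev] != ports:
            changes[dev] += 1
        last[dev] = ports
    return dict(changes)


# The eight LAG map lines from the report, used here as sample input.
SAMPLE = [
    "[Fri Dec 9 07:27:20 2022] mlx5_core 0000:41:00.0: lag map active ports: 1, 2",
    "[Fri Dec 9 07:27:20 2022] mlx5_core 0000:41:00.0: lag map active ports: 2",
    "[Fri Dec 9 07:27:20 2022] mlx5_core 0000:41:00.0: lag map active ports: 1, 2",
    "[Fri Dec 9 07:27:21 2022] mlx5_core 0000:41:00.0: lag map active ports: 2",
    "[Fri Dec 9 07:27:21 2022] mlx5_core 0000:41:00.0: lag map active ports: 1, 2",
    "[Fri Dec 9 07:27:22 2022] mlx5_core 0000:41:00.0: lag map active ports: 2",
    "[Fri Dec 9 07:27:23 2022] mlx5_core 0000:41:00.0: lag map active ports: 1, 2",
    "[Fri Dec 9 07:27:23 2022] mlx5_core 0000:41:00.0: lag map active ports: 2",
]

if __name__ == "__main__":
    for dev, n in count_lag_changes(SAMPLE).items():
        print(f"{dev}: {n} LAG map changes")
```

The function accepts any iterable of lines, so it could equally be fed sys.stdin with the output of dmesg -T piped in. A count near zero on the 5.15 kernel versus a steadily growing count on 5.19 would be one way to confirm the regression on the original system.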