I've re-executed the test plan on Mellanox ConnectX-6 Dx (MT2892), as there seems to be some issue with this hardware generation (see bug #2020409, comment #11 and following). And indeed, I can reproduce that failure: after the reboot the devices are not set to "switchdev" mode and VF-LAG is not activated (full transcript below).
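For anyone re-running this, the two manual checks used in the transcript below (eswitch mode via devlink, and the mlx5 LAG state in debugfs) could be scripted roughly as follows. This is only a sketch: the function names (eswitch_mode, check_pf, main) are made up for illustration, and it assumes the debugfs state file reads "active" once VF-LAG is up, which this transcript does not show (it only ever shows "disabled").

```shell
#!/usr/bin/env bash
# Sketch only: automates the two checks done by hand in the transcript.

# Extract the eswitch mode from `devlink dev eswitch show` output, e.g.
#   pci/0000:61:00.0: mode legacy inline-mode none encap-mode basic
# Prints nothing if no "mode" field is present (e.g. "Operation not supported").
eswitch_mode() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "mode") { print $(i + 1); exit } }'
}

# Report PASS/FAIL for one PF, given its eswitch mode and LAG state.
# ASSUMPTION: the LAG state file reads "active" when VF-LAG is enabled.
check_pf() {
    local dev=$1 mode=$2 lag=$3
    if [ "$mode" = "switchdev" ] && [ "$lag" = "active" ]; then
        echo "PASS $dev mode=$mode lag=$lag"
    else
        echo "FAIL $dev mode=$mode lag=$lag"
        return 1
    fi
}

# Query each PCI address given on the command line (needs root for debugfs).
main() {
    local rc=0 dev mode lag
    for dev in "$@"; do
        mode=$(devlink dev eswitch show "pci/$dev" 2>/dev/null | eswitch_mode)
        lag=$(cat "/sys/kernel/debug/mlx5/$dev/lag/state" 2>/dev/null)
        check_pf "$dev" "${mode:-unknown}" "${lag:-unknown}" || rc=1
    done
    return "$rc"
}

# Only touch the hardware when PCI addresses are passed, e.g.:
#   sudo ./check-vf-lag.sh 0000:61:00.0 0000:61:00.1
if [ "$#" -gt 0 ]; then
    main "$@"
fi
```

Given the post-reboot values in this transcript (mode legacy, state disabled), both PFs would be reported as FAIL.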

ubuntu@romano:~$ sudo lshw -c network -businfo
Bus info          Device      Class          Description
============================================================
pci@0000:21:00.0  ens13f0np0  network        BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet Controller
pci@0000:21:00.1  ens13f1np1  network        BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet Controller
pci@0000:61:00.0  ens7f0      network        MT2892 Family [ConnectX-6 Dx]
pci@0000:61:00.1  ens7f1      network        MT2892 Family [ConnectX-6 Dx]
ubuntu@romano:~$ sudo devlink dev eswitch show pci/0000:61:00.0
kernel answers: Operation not supported
ubuntu@romano:~$ sudo devlink dev eswitch show pci/0000:61:00.1
kernel answers: Operation not supported

ubuntu@romano:~$ sudo apt-get install --install-recommends linux-generic-hwe-22.04
# reboot
ubuntu@romano:~$ uname -a
Linux romano 6.8.0-52-generic #53~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Wed Jan 15 19:18:46 UTC 2 x86_64 x86_64 x86_64 GNU/Linux

ubuntu@romano:~$ sudo apt install -t jammy-proposed netplan.io
ubuntu@romano:~$ apt list *netplan*
Listing... Done
libnetplan-dev/jammy-proposed 0.107.1-3ubuntu0.22.04.2 amd64
libnetplan0/jammy-proposed,now 0.107.1-3ubuntu0.22.04.2 amd64 [installed,automatic]
netplan-generator/jammy-proposed,now 0.107.1-3ubuntu0.22.04.2 amd64 [installed,automatic]
netplan.io/jammy-proposed,now 0.107.1-3ubuntu0.22.04.2 amd64 [installed]
python3-netplan/jammy-proposed,now 0.107.1-3ubuntu0.22.04.2 amd64 [installed,automatic]

ubuntu@romano:~$ sudo cat /sys/kernel/debug/mlx5/0000:61:00.0/lag/state
disabled
ubuntu@romano:~$ sudo cat /sys/kernel/debug/mlx5/0000:61:00.1/lag/state
disabled

ubuntu@romano:~$ sudo netplan get
network:
  version: 2
  ethernets:
    ens13f0np0:
      match:
        macaddress: "84:16:0c:3d:63:ce"
      addresses:
      - "10.241.7.26/24"
      nameservers:
        addresses:
        - 10.239.8.12
        - 10.239.8.13
        - 10.239.8.11
        - 10.176.2.4
        - 10.176.2.2
        - 10.176.2.3
        search:
        - maas
        - dh1-j8-1.tor3-sqa-shared-maas.solutionsqa
        - dh1-j8-2.tor3-sqa-shared-maas.solutionsqa
        - dh1-j9-1.tor3-sqa-shared-maas.solutionsqa
        - dh1-j9-2.tor3-sqa-shared-maas.solutionsqa
      gateway4: 10.241.7.1
      set-name: "ens13f0np0"
      mtu: 1500
    ens13f1np1:
      match:
        macaddress: "84:16:0c:3d:63:cf"
      set-name: "ens13f1np1"
      mtu: 1500
    ens7f0:
      match:
        macaddress: "b8:3f:d2:2d:68:7e"
      optional: true
      set-name: "ens7f0"
      mtu: 1500
      virtual-function-count: 8
      embedded-switch-mode: "switchdev"
      delay-virtual-functions-rebind: true
    ens7f1:
      match:
        macaddress: "b8:3f:d2:2d:68:7f"
      set-name: "ens7f1"
      mtu: 1500
      virtual-function-count: 8
      embedded-switch-mode: "switchdev"
      delay-virtual-functions-rebind: true
  bonds:
    bond0:
      interfaces:
      - ens7f0
      - ens7f1
      parameters:
        mode: "active-backup"

# reboot

## FAILURE
ubuntu@romano:~$ sudo lshw -c network -businfo
Bus info          Device      Class          Description
============================================================
pci@0000:21:00.0  ens13f0np0  network        BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet Controller
pci@0000:21:00.1  ens13f1np1  network        BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet Controller
pci@0000:61:00.0  ens7f0      network        MT2892 Family [ConnectX-6 Dx]
pci@0000:61:00.1  ens7f1      network        MT2892 Family [ConnectX-6 Dx]
pci@0000:61:00.2  ens7f0v0    network        ConnectX Family mlx5Gen Virtual Function
pci@0000:61:00.3  ens7f0v1    network        ConnectX Family mlx5Gen Virtual Function
pci@0000:61:00.4  ens7f0v2    network        ConnectX Family mlx5Gen Virtual Function
pci@0000:61:00.5  ens7f0v3    network        ConnectX Family mlx5Gen Virtual Function
pci@0000:61:00.6  ens7f0v4    network        ConnectX Family mlx5Gen Virtual Function
pci@0000:61:00.7  ens7f0v5    network        ConnectX Family mlx5Gen Virtual Function
pci@0000:61:01.0  ens7f0v6    network        ConnectX Family mlx5Gen Virtual Function
pci@0000:61:01.1  ens7f0v7    network        ConnectX Family mlx5Gen Virtual Function

ubuntu@romano:~$ sudo cat /sys/kernel/debug/mlx5/0000:61:00.0/lag/state
disabled
ubuntu@romano:~$ sudo cat /sys/kernel/debug/mlx5/0000:61:00.1/lag/state
disabled

ubuntu@romano:~$ sudo devlink dev eswitch show pci/0000:61:00.0
pci/0000:61:00.0: mode legacy inline-mode none encap-mode basic
ubuntu@romano:~$ sudo devlink dev eswitch show pci/0000:61:00.1
pci/0000:61:00.1: mode legacy inline-mode none encap-mode basic

** Tags added: block-proposed-jammy

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1988018

Title:
  [SRU][mlx5] Intermittent VF-LAG activation failure

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1988018/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs