Thanks a lot for checking, Itai!

As discussed offline, we made another attempt with the 5.19.0-28-generic
kernel and 22.35.2302 firmware on a different system, and we also did
not run into this issue there.

Setting this to Incomplete until we regain access to the system where
this was first observed, so we can compare software and hardware
components.

** Changed in: linux (Ubuntu)
       Status: Confirmed => Incomplete

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1999229

Title:
  mlx5 VF LAG flapping

Status in linux package in Ubuntu:
  Incomplete

Bug description:
  # sudo lsb_release -a
  No LSB modules are available.
  Distributor ID:       Ubuntu
  Description:  Ubuntu 22.04.1 LTS
  Release:      22.04
  Codename:     jammy

  # mlxfwmanager 
  Querying Mellanox devices firmware ...

  Device #1:
  ----------

    Device Type:      ConnectX6DX
    Part Number:      MCX623106AN-CDA_Ax
    Description:      ConnectX-6 Dx EN adapter card; 100GbE; Dual-port QSFP56; PCIe 4.0/3.0 x16;
    PSID:             MT_0000000359
    PCI Device Name:  0000:41:00.0
    Base GUID:        08c0eb03006fb26e
    Base MAC:         08c0eb6fb26e
    Versions:         Current        Available     
       FW             22.34.4000     N/A           
       PXE            3.6.0700       N/A           
       UEFI           14.27.0015     N/A           

  # uname -a
  Linux ps6-ra1-n2 5.19.0-24-generic #25~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Fri Nov 18 14:28:08 UTC 2 x86_64 x86_64 x86_64 GNU/Linux

  Kernel from the linux-generic-hwe-22.04-edge package in jammy-proposed;
  see https://wiki.ubuntu.com/Testing/EnableProposed for documentation
  on how to enable and use -proposed.

  Problem:
  Severe packet loss to the high-speed NIC due to what appears to be VF LAG flapping:
  [Fri Dec  9 07:27:19 2022] mlx5_core 0000:41:00.0: mlx5_cmd_out_err:778:(pid 3383): SET_FLOW_TABLE_ENTRY(0x936) op_mod(0x0) failed, status bad resource(0x5), syndrome (0xf2ff71), err(-22)
  [Fri Dec  9 07:27:19 2022] mlx5_core 0000:41:00.0: E-Switch: Failed to create termination table rule, err -EINVAL
  [Fri Dec  9 07:27:19 2022] mlx5_core 0000:41:00.0: E-Switch: Failed to get termination table, err -EINVAL
  [Fri Dec  9 07:27:19 2022] mlx5_core 0000:41:00.1: mlx5_cmd_out_err:778:(pid 3383): SET_FLOW_TABLE_ENTRY(0x936) op_mod(0x0) failed, status bad resource(0x5), syndrome (0xf2ff71), err(-22)
  [Fri Dec  9 07:27:19 2022] mlx5_core 0000:41:00.1: E-Switch: Failed to create termination table rule, err -EINVAL
  [Fri Dec  9 07:27:19 2022] mlx5_core 0000:41:00.1: E-Switch: Failed to get termination table, err -EINVAL
  [Fri Dec  9 07:27:20 2022] mlx5_core 0000:41:00.0: lag map active ports: 1, 2
  [Fri Dec  9 07:27:20 2022] mlx5_core 0000:41:00.0: lag map active ports: 2
  [Fri Dec  9 07:27:20 2022] mlx5_core 0000:41:00.0: lag map active ports: 1, 2
  [Fri Dec  9 07:27:21 2022] mlx5_core 0000:41:00.0: lag map active ports: 2
  [Fri Dec  9 07:27:21 2022] mlx5_core 0000:41:00.0: lag map active ports: 1, 2
  [Fri Dec  9 07:27:22 2022] mlx5_core 0000:41:00.0: lag map active ports: 2
  [Fri Dec  9 07:27:23 2022] mlx5_core 0000:41:00.0: lag map active ports: 1, 2
  [Fri Dec  9 07:27:23 2022] mlx5_core 0000:41:00.0: lag map active ports: 2

  This does not happen when using the Jammy 5.15 kernel, everything else
  in the environment being equal.
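
  To put a number on the flapping shown in the log above, the "lag map
  active ports" lines can be counted per device. The following is only a
  minimal sketch: the function name and the regular expression are
  illustrative assumptions based on the log lines quoted in this report,
  not part of any mlx5 tooling.

```python
import re
import sys

# Assumed line format, taken from the log excerpt in this report:
#   [Fri Dec  9 07:27:20 2022] mlx5_core 0000:41:00.0: lag map active ports: 1, 2
LAG_MAP = re.compile(r"mlx5_core (\S+): lag map active ports: ([\d, ]+)")


def count_lag_map_changes(lines):
    """Return {pci_device: number of times the active port set changed}."""
    last = {}      # most recent active port set seen per device
    changes = {}   # number of transitions seen per device
    for line in lines:
        m = LAG_MAP.search(line)
        if not m:
            continue
        dev = m.group(1)
        ports = tuple(int(p) for p in m.group(2).replace(" ", "").split(",") if p)
        # Count a change only when the port set differs from the previous one.
        if dev in last and last[dev] != ports:
            changes[dev] = changes.get(dev, 0) + 1
        last[dev] = ports
    return changes


if __name__ == "__main__":
    # Reads kernel log text from standard input.
    for dev, n in count_lag_map_changes(sys.stdin).items():
        print(f"{dev}: {n} lag map changes")
```

  Saved under a hypothetical name such as lag_flaps.py, it could be fed
  with "sudo dmesg -T | python3 lag_flaps.py" and run on both the 5.15
  and 5.19 kernels to compare.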

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1999229/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
