Hi Jeff,

Upstream commit
50b2412b7e78 net/mlx5: Avoid possible free of command entry while timeout comp handler
was picked into the Ubuntu-5.4.0-56.62 kernel
(hash bcd6e98bef76cc8a49a1b736b0fefffbffb75c30)
(v5.4.71 upstream stable release,
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1902110 )

Now a new issue arises:
reloading the mlx5 modules causes an error message in the kernel log buffer:
"cmd_work_handler:887:(pid 292): failed to allocate command entry"

reproduction:
# modprobe -r mlx5_ib mlx5_core
# modprobe mlx5_core mlx5_ib
# dmesg
[  142.638490] mlx5_core 0000:08:00.1: E-Switch: cleanup
[  143.734339] mlx5_core 0000:08:00.0: E-Switch: cleanup
[  164.171511] mlx5_core: unknown parameter 'mlx5_ib' ignored
[  164.173501] mlx5_core 0000:08:00.0: firmware version: 16.28.1002
[  164.173576] mlx5_core 0000:08:00.0: 126.016 Gb/s available PCIe bandwidth (8 GT/s x16 link)
[  164.457342] mlx5_core 0000:08:00.0: Rate limit: 127 rates are supported, range: 0Mbps to 97656Mbps
[  164.457365] mlx5_core 0000:08:00.0: E-Switch: Total vports 2, per vport: max uc(1024) max mc(16384)
[  164.484659] port_module: 5 callbacks suppressed
[  164.484665] mlx5_core 0000:08:00.0: Port module event: module 0, Cable plugged
[  164.485112] mlx5_core 0000:08:00.0: mlx5_pcie_event:294:(pid 8): PCIe slot advertised sufficient power (75W).
[  164.494771] mlx5_core 0000:08:00.1: firmware version: 16.28.1002
[  164.494844] mlx5_core 0000:08:00.1: 126.016 Gb/s available PCIe bandwidth (8 GT/s x16 link)
[  164.779534] mlx5_core 0000:08:00.1: Rate limit: 127 rates are supported, range: 0Mbps to 97656Mbps
[  164.779552] mlx5_core 0000:08:00.1: E-Switch: Total vports 2, per vport: max uc(1024) max mc(16384)
[  164.808886] mlx5_core 0000:08:00.1: Port module event: module 1, Cable plugged
[  164.809228] mlx5_core 0000:08:00.1: mlx5_pcie_event:294:(pid 292): PCIe slot advertised sufficient power (75W).
[  164.840667] mlx5_core 0000:08:00.0: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
[  165.081342] mlx5_core 0000:08:00.1: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
[  165.282793] mlx5_ib: Mellanox Connect-IB Infiniband driver v5.0-0
[  165.438226] mlx5_core 0000:08:00.0: cmd_work_handler:887:(pid 292): failed to allocate command entry
[  165.442506] infiniband rocep8s0f0: reg_mr_callback:104:(pid 292): async reg mr failed. status -11
#  
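
To make the check repeatable after each module reload, the two failure messages can be searched for mechanically. This is only a rough sketch: the function name check_mlx5_cmd_errors is made up for illustration, and it assumes each kernel message arrives on a single line, as dmesg prints it.

```shell
# Hypothetical helper, not part of the original report.
# Reads kernel log text on stdin, prints any line carrying one of the two
# failure messages, and returns 1 if at least one was found (0 otherwise).
check_mlx5_cmd_errors() {
    if grep -E 'cmd_work_handler:[0-9]+:\(pid [0-9]+\): failed to allocate command entry|reg_mr_callback:[0-9]+:\(pid [0-9]+\): async reg mr failed'; then
        return 1
    fi
    return 0
}
```

Typical use would be "dmesg | check_mlx5_cmd_errors" right after the reload. The line number and pid are matched loosely because they can differ between kernel builds and runs.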
 
The following commits fix this issue:
410bd754cd73 net/mlx5: Add retry mechanism to the command entry index allocation (upstream 5.9)
1d5558b1f0de net/mlx5: poll cmd EQ in case of command timeout (upstream 5.9)
d43b7007dbd1 net/mlx5: Fix a race when moving command interface to events mode (upstream 5.7-rc7)
3ed879965cc4 net/mlx5: Use async EQ setup cleanup helpers for multiple EQs (upstream 5.6-rc1)

Those are on the master-next branch of the focal tree, also synced from linux stable
(v5.4.79 upstream stable release,
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1907151 ).

# git log --oneline  Ubuntu-5.4.0-59.65..master-next
....
400ec5bb2816 net/mlx5: Add retry mechanism to the command entry index allocation
2bd608898edd net/mlx5: Fix a race when moving command interface to events mode
bec07c488db0 net/mlx5: poll cmd EQ in case of command timeout
0c9bfdf598e1 net/mlx5: Use async EQ setup cleanup helpers for multiple EQs
.....
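
Since the backported commits carry different hashes than the upstream ones, matching has to be done by subject line. A small sketch of that check follows; the function name missing_mlx5_fixes is hypothetical, and the four subjects are the ones listed in this comment.

```shell
# Hypothetical helper, not part of the original report.
# Reads "git log --oneline" output on stdin and prints every fix subject
# that does not appear in it. Returns 0 if all four are present, 1 otherwise.
missing_mlx5_fixes() {
    log=$(cat)
    missing=0
    while IFS= read -r subject; do
        case "$log" in
            *"$subject"*) ;;   # subject found somewhere in the log text
            *) printf 'missing: %s\n' "$subject"; missing=1 ;;
        esac
    done <<'EOF'
net/mlx5: Add retry mechanism to the command entry index allocation
net/mlx5: poll cmd EQ in case of command timeout
net/mlx5: Fix a race when moving command interface to events mode
net/mlx5: Use async EQ setup cleanup helpers for multiple EQs
EOF
    return "$missing"
}
```

For example: "git log --oneline Ubuntu-5.4.0-59.65..master-next | missing_mlx5_fixes". Substring matching is a deliberate simplification; it does not verify that the backport content matches upstream.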

I compiled master-next, booted the system with it and the issue is
resolved.

-- 
https://bugs.launchpad.net/bugs/1905574

Title:
  Ubuntu 20.10 four needed fixes to 'Add driver for Mellanox Connect-IB
  adapters'

