On 7/9/2020 3:36 PM, Niklas Schnelle wrote: > > On 7/8/20 5:44 PM, Parav Pandit wrote: > ... snip .. >>> >> >> It is likely because events_cleanup() freed the memory using kvfree() that >> health recovery context is trying to access in notifier chain. >> >> While reviewing I see few more errors as below. >> (a) mlx5_pci_err_detected() invokes mlx5_drain_health_wq() outside of >> intf_state_mutex. >> (b) mlx5_enter_error_state() in commit b6e0b6bebe0 read and updates dev >> state outside of intf_state_mutex. >> (c) due to drain_health_wq() introduction in mlx5_pci_close() in commit >> 42ea9f1b5c625 health_wq is flushed twice. >> (d) priv->events freed is kvfree() but allocated using kzalloc(). >> >> This code certainly needs rework. >> >> In meantime to avoid this regression, I believe below hunk eliminates error >> introduced in patch 41798df9bfc. >> Will you please help test it? >> >> Shay and I did the review of below patch. >> If it works I will get it through Saeed's tree and internal reviews. >> >> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c >> b/drivers/net/ethernet/mellanox/mlx5/core/main.c >> index ebec2318dbc4..529df5703737 100644 >> --- a/drivers/net/ethernet/mellanox/mlx5/core/main.c >> +++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c >> @@ -785,11 +785,6 @@ static int mlx5_pci_init(struct mlx5_core_dev *dev, >> struct pci_dev *pdev, >> >> static void mlx5_pci_close(struct mlx5_core_dev *dev) >> { >> - /* health work might still be active, and it needs pci bar in >> - * order to know the NIC state. Therefore, drain the health WQ >> - * before removing the pci bars >> - */ >> - mlx5_drain_health_wq(dev); >> iounmap(dev->iseg); >> pci_clear_master(dev->pdev); >> release_bar(dev->pdev); >> @@ -1235,6 +1230,7 @@ void mlx5_unload_one(struct mlx5_core_dev *dev, bool >> cleanup) >> if (cleanup) { >> mlx5_unregister_device(dev); >> mlx5_devlink_unregister(priv_to_devlink(dev)); >> + mlx5_drain_health_wq(dev); > I think with the above you can remove the mlx5_drain_health_wq(dev) in > remove_one() > as that calls mlx5_unload_one() with cleanup == true. Oh yes, I was supposed to remove from there too. That was the whole point of removing from there. But I missed. :( My bad that I didn't give share patch. Please find the inline patch at the end of the email thread.
> Also I wonder if it is a problem > that the order of mlx5_devlink_unregister(priv_to_devlink(dev)) and > mlx5_unregister_device(dev) > is switched compared to the 5.7 code. That said changing both seems to result > in a deadlock > though not the "leak of a command resource": > Current order in 5.8-rc4 of unregister_device and devlink_unregister is the right one. That is, to unregister the netdevice before unregistering devlink device. Please do not change the order of mlx5_unregister_device() and mlx5_devlink_unregister(). >> > As is the patch above fixes the dereference but results in the same > completion error > as current 5.8-rc4 Below patch should hopefully fix it. It fixes the bug introduced in commit 41798df9bfca. Additionally it fixes one small change of 42ea9f1b5c62. Please remove all previous changes and apply below changes. Hopefully this should resolve. From a260b2e6a6065a57c2fa621271483cd51d0a1abf Mon Sep 17 00:00:00 2001 From: Parav Pandit <pa...@mellanox.com> Date: Thu, 9 Jul 2020 20:57:13 +0300 Subject: [PATCH] net/mlx5: Drain health wq after unregistering device When health worker is disabled before unregistering the devices, and if the firmware error is triggered while in middle of unregstering the device, it leads to very long command timeout. This error occurs because unregistering device involves several firmware commands, and health recovery worker is disabled. So device state is never updated as error. cpu-0 cpu-1 ----- ----- remove_one() mlx5_drain_health_wq() poll_health() mlx5_unregister_device() mlx5_trigger_health_work() mlx5_cmd_exec() /* health work skipped */ /* timeout */ Hence, disable the health worker after unregistering the device and before fully unloading the device, as health worker uses the device during health recovery. Both such contexts are serialized by intf_state_mutex. Change-Id: I054df865726af42ea940300b439889e58d0956d1 Fixes: 41798df9bfca ("net/mlx5: Drain wq first during PCI device removal") Fixes: 42ea9f1b5c62 ("net/mlx5: drain health workqueue in case of driver load error") Reported-by: Niklas Schnelle <schne...@linux.ibm.com> Signed-off-by: Shay Drory <sh...@mellanox.com> Signed-off-by: Parav Pandit <pa...@mellanox.com> --- drivers/net/ethernet/mellanox/mlx5/core/main.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c index e32d46c33701..100a5c31ee0b 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c @@ -785,11 +785,6 @@ static int mlx5_pci_init(struct mlx5_core_dev *dev, struct pci_dev *pdev, static void mlx5_pci_close(struct mlx5_core_dev *dev) { - /* health work might still be active, and it needs pci bar in - * order to know the NIC state. Therefore, drain the health WQ - * before removing the pci bars - */ - mlx5_drain_health_wq(dev); iounmap(dev->iseg); pci_clear_master(dev->pdev); release_bar(dev->pdev); @@ -1230,6 +1225,7 @@ void mlx5_unload_one(struct mlx5_core_dev *dev, bool cleanup) if (cleanup) { mlx5_unregister_device(dev); mlx5_devlink_unregister(priv_to_devlink(dev)); + mlx5_drain_health_wq(dev); } else { mlx5_detach_device(dev); } @@ -1361,6 +1357,11 @@ static int init_one(struct pci_dev *pdev, const struct pci_device_id *id) return 0; err_load_one: + /* health work might still be active, and it needs pci bar in + * order to know the NIC state. Therefore, drain the health WQ + * before removing the pci bars + */ + mlx5_drain_health_wq(dev); mlx5_pci_close(dev); pci_init_err: mlx5_mdev_uninit(dev); @@ -1377,7 +1378,6 @@ static void remove_one(struct pci_dev *pdev) devlink_reload_disable(devlink); mlx5_crdump_disable(dev); - mlx5_drain_health_wq(dev); mlx5_unload_one(dev, true); mlx5_pci_close(dev); mlx5_mdev_uninit(dev); -- 2.26.2