From: Francesco Ruggeri <frugg...@arista.com>

We observed two panics involving races with igb_reset_task.
The first panic is caused by this race condition:

        kworker                 reboot -f

        igb_reset_task
        igb_reinit_locked
        igb_down
        napi_synchronize
                                __igb_shutdown
                                igb_clear_interrupt_scheme
                                igb_free_q_vectors
                                igb_free_q_vector
                                adapter->q_vector[v_idx] = NULL;
        napi_disable
        Panics trying to access
        adapter->q_vector[v_idx].napi_state

The second panic (a divide error) is caused by this race:

kworker         reboot -f       tx packet

igb_reset_task
                __igb_shutdown
                rtnl_lock()
                ...
                igb_clear_interrupt_scheme
                igb_free_q_vectors
                adapter->num_tx_queues = 0
                ...
                rtnl_unlock()
rtnl_lock()
igb_reinit_locked
igb_down
igb_up
netif_tx_start_all_queues
                                dev_hard_start_xmit
                                igb_xmit_frame
                                igb_tx_queue_mapping
                                Panics on
                                r_idx % adapter->num_tx_queues

This commit applies to igb_reset_task the same changes that
were applied to ixgbe in commit 2f90b8657ec9 ("ixgbe: this patch
adds support for DCB to the kernel and ixgbe driver"),
commit 8f4c5c9fb87a ("ixgbe: reinit_locked() should be called with
rtnl_lock") and commit 88adce4ea8f9 ("ixgbe: fix possible race in
reset subtask").

Signed-off-by: Francesco Ruggeri <frugg...@arista.com>
Tested-by: Aaron Brown <aaron.f.br...@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.ngu...@intel.com>
---
 drivers/net/ethernet/intel/igb/igb_main.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c 
b/drivers/net/ethernet/intel/igb/igb_main.c
index 8bb3db2cbd41..6e5861bfb0fa 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -6224,9 +6224,18 @@ static void igb_reset_task(struct work_struct *work)
        struct igb_adapter *adapter;
        adapter = container_of(work, struct igb_adapter, reset_task);
 
+       rtnl_lock();
+       /* If we're already down or resetting, just bail */
+       if (test_bit(__IGB_DOWN, &adapter->state) ||
+           test_bit(__IGB_RESETTING, &adapter->state)) {
+               rtnl_unlock();
+               return;
+       }
+
        igb_dump(adapter);
        netdev_err(adapter->netdev, "Reset adapter\n");
        igb_reinit_locked(adapter);
+       rtnl_unlock();
 }
 
 /**
-- 
2.26.2

Reply via email to