A NIC driver may end up calling napi_disable() twice on the same napi
instance, which can deadlock. For example, the first napi_disable()
spins until NAPI_STATE_SCHED is cleared by napi_complete_done(), then
sets it again. When napi_disable() is called a second time, it loops
forever because no napi poll is running to clear NAPI_STATE_SCHED.
CPU0                                    CPU1
napi_disable()
  test_and_set_bit() (spins)
                                        napi_complete_done() clears
                                        NAPI_STATE_SCHED
  test_and_set_bit() returns 0 and
  sets NAPI_STATE_SCHED again
napi_disable()
  test_and_set_bit()
  (returns 1 and loops forever because
   no napi instance is scheduled to
   clear the NAPI_STATE_SCHED bit)
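
As an illustration, a driver can reach this when two teardown paths
both disable the same napi instance, e.g. a reset worker and the
close path. The sketch below is hypothetical (the foo_* names and the
adapter layout are made up for the example, not taken from any
in-tree driver):

#include <linux/netdevice.h>
#include <linux/workqueue.h>

/* hypothetical adapter private data */
struct foo_adapter {
	struct net_device *netdev;
	struct napi_struct napi;
	struct work_struct reset_task;
};

static void foo_reset_task(struct work_struct *work)
{
	struct foo_adapter *adapter =
		container_of(work, struct foo_adapter, reset_task);

	/* first napi_disable(): completes and leaves NAPI_STATE_SCHED set */
	napi_disable(&adapter->napi);
	/* ... reinitialize rings; napi_enable() may be skipped on error ... */
}

static int foo_close(struct net_device *netdev)
{
	struct foo_adapter *adapter = netdev_priv(netdev);

	/* second napi_disable(): test_and_set_bit() sees SCHED set and spins */
	napi_disable(&adapter->napi);
	return 0;
}
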
Check the napi state bits and, if the napi instance is already
disabled, return early to avoid spinning forever.
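
For reference, the first of the two added checks uses
napi_disable_pending(), which (paraphrased here from
include/linux/netdevice.h) simply reports whether a disable is
already in progress:

static inline bool napi_disable_pending(struct napi_struct *n)
{
	return test_bit(NAPI_STATE_DISABLE, &n->state);
}
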
Fixes: bea3348eef27 ("[NET]: Make NAPI polling independent of struct net_device objects.")
Signed-off-by: Lijun Pan <[email protected]>
---
net/core/dev.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/net/core/dev.c b/net/core/dev.c
index 6c5967e80132..eb3c0ddd4fd7 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6809,6 +6809,24 @@ EXPORT_SYMBOL(netif_napi_add);
void napi_disable(struct napi_struct *n)
{
might_sleep();
+
+	/* Bail out if napi is already disabled or being disabled.
+	 * Once napi has been disabled, the state bits look like this:
+	 * NAPI_STATE_SCHED (set by the previous napi_disable())
+	 * NAPI_STATE_NPSVC (set by the previous napi_disable())
+	 * NAPI_STATE_DISABLE (cleared by the previous napi_disable())
+	 * NAPI_STATE_PREFER_BUSY_POLL (cleared by the previous napi_complete_done())
+	 * NAPI_STATE_MISSED (cleared by the previous napi_complete_done())
+	 */
+
+	if (napi_disable_pending(n))
+		return;
+	if (test_bit(NAPI_STATE_SCHED, &n->state) &&
+	    test_bit(NAPI_STATE_NPSVC, &n->state) &&
+	    !test_bit(NAPI_STATE_MISSED, &n->state) &&
+	    !test_bit(NAPI_STATE_PREFER_BUSY_POLL, &n->state))
+		return;
+
set_bit(NAPI_STATE_DISABLE, &n->state);
while (test_and_set_bit(NAPI_STATE_SCHED, &n->state))
--
2.23.0