On Fri, 26 Feb 2021 17:35:21 -0800 Wei Wang wrote: > On Fri, Feb 26, 2021 at 5:22 PM Jakub Kicinski <k...@kernel.org> wrote: > > > > On Fri, 26 Feb 2021 17:02:17 -0800 Wei Wang wrote: > > > static int napi_thread_wait(struct napi_struct *napi) > > > { > > > + bool woken = false; > > > + > > > set_current_state(TASK_INTERRUPTIBLE); > > > > > > while (!kthread_should_stop() && !napi_disable_pending(napi)) { > > > - if (test_bit(NAPI_STATE_SCHED, &napi->state)) { > > > + unsigned long state = READ_ONCE(napi->state); > > > + > > > + if ((state & NAPIF_STATE_SCHED) && > > > + ((state & NAPIF_STATE_SCHED_THREAD) || woken)) { > > > WARN_ON(!list_empty(&napi->poll_list)); > > > __set_current_state(TASK_RUNNING); > > > return 0; > > > + } else { > > > + WARN_ON(woken); > > > } > > > > > > schedule(); > > > + woken = true; > > > set_current_state(TASK_INTERRUPTIBLE); > > > } > > > __set_current_state(TASK_RUNNING); > > > > > > I don't think it is sufficient to only set SCHED_THREADED bit when the > > > thread is in RUNNING state. > > > In fact, the thread is most likely NOT in RUNNING mode before we call > > > wake_up_process() in ____napi_schedule(), because it has finished the > > > previous round of napi->poll() and SCHED bit was cleared, so > > > napi_thread_wait() sets the state to INTERRUPTIBLE and schedule() call > > > should already put it in sleep. > > > > That's why the check says "|| woken": > > > > ((state & NAPIF_STATE_SCHED_THREAD) || woken)) > > > > thread knows it owns the NAPI if: > > > > (a) the NAPI has the explicit flag set > > or > > (b) it was just worken up and !kthread_should_stop(), since only > > someone who just claimed the normal SCHED on thread's behalf > > will wake it up > > The 'woken' is set after schedule(). If it is the first time > napi_threaded_wait() is called, and SCHED_THREADED is not set, and > woken is not set either, this thread will be put to sleep when it > reaches schedule(), even though there is work waiting to be done on > that napi. And I think this kthread will not be woken up again > afterwards, since the SCHED bit is already grabbed.
Indeed, looks like the task will be in WAKING state until it runs? We can switch the check in ____napi_schedule() from if (thread->state == TASK_RUNNING) to if (!(thread->state & TASK_INTERRUPTIBLE)) ?