On Sun, 2017-02-26 at 09:34 -0800, Eric Dumazet wrote:
> I do not believe this bug is mlx4 specific.
>
> Anything doing the following while hard irq were not masked :
>
> local_bh_disable();
> napi_reschedule(&priv->rx_cq[ring]->napi);
> local_bh_enable();
>
> Like in mlx4_en_recover_from_oom()
>
> Can trigger the issue really.
>
> Unfortunately I do not see how core layer can handle this.
> Only the driver hard irq could possibly know that it could not grab
> NAPI_STATE_SCHED

Actually we could use an additional bit for that, that the driver would set even if NAPI_STATE_SCHED could not be grabbed. Let me try something.
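
To make the idea concrete, here is a rough user-space model of the "additional bit" scheme. It is only a sketch under stated assumptions, not a kernel patch: struct sketch_napi, SKETCH_SCHED, SKETCH_MISSED, sketch_schedule_prep() and sketch_complete() are all hypothetical names, and C11 atomics stand in for the kernel's cmpxchg() on the NAPI state word. A caller that finds the scheduled bit already taken leaves a "missed" marker behind, and whoever currently owns the poll sees that marker on completion and keeps ownership for another poll instead of going idle.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical state bits; SKETCH_SCHED models NAPI_STATE_SCHED. */
enum {
    SKETCH_SCHED  = 1u << 0, /* somebody owns the poll */
    SKETCH_MISSED = 1u << 1, /* a schedule attempt lost the race (assumed extra bit) */
};

struct sketch_napi {
    atomic_uint state;
};

/*
 * Try to take ownership of the poll.
 * Returns true if the caller grabbed SKETCH_SCHED.
 * If the bit was already set, record the miss so the owner can repoll.
 */
static bool sketch_schedule_prep(struct sketch_napi *n)
{
    unsigned int old = atomic_load(&n->state);
    unsigned int val;

    do {
        val = old | SKETCH_SCHED;
        if (old & SKETCH_SCHED)
            val |= SKETCH_MISSED;
        /* On failure, old is refreshed with the current value and we retry. */
    } while (!atomic_compare_exchange_weak(&n->state, &old, val));

    return !(old & SKETCH_SCHED);
}

/*
 * Called by the owner when its poll finished under budget.
 * Returns true if ownership was released.
 * Returns false if a miss was recorded: the marker is cleared, SKETCH_SCHED
 * stays set, and the caller is expected to poll again.
 */
static bool sketch_complete(struct sketch_napi *n)
{
    unsigned int old = atomic_load(&n->state);
    unsigned int val;

    assert(old & SKETCH_SCHED); /* only the owner may complete */

    do {
        val = old & ~(SKETCH_SCHED | SKETCH_MISSED);
        if (old & SKETCH_MISSED)
            val |= SKETCH_SCHED;
    } while (!atomic_compare_exchange_weak(&n->state, &old, val));

    return !(old & SKETCH_MISSED);
}
```

In this model the hard irq handler would call sketch_schedule_prep() unconditionally. If it loses the race against something like the local_bh_disable()/napi_reschedule()/local_bh_enable() sequence quoted above, the event is not lost, because the current owner gets false back from sketch_complete() and has to run one more poll.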