From: Eric Dumazet <eric.duma...@gmail.com>
Date: Mon, 27 Feb 2017 08:44:14 -0800
> Any point doing a napi_schedule() not from device hard irq handler
> is subject to the race for NIC using some kind of edge trigger
> interrupts.
>
> Since we do not provide a ndo to disable device interrupts, the
> following can happen.

Ok, now I understand.

I think even without considering the race you are trying to solve,
this situation is really dangerous.

I am sure that every ->poll() handler out there was written by an
author who completely assumed that if they are executing then the
device's interrupts for that NAPI instance are disabled.  And this is
with very few, if any, exceptions.

So if we saw a driver doing something like:

	reg->irq_enable ^= value;

after napi_complete_done(), it would be quite understandable.

We really made a mistake taking the napi_schedule() call out of
the domain of the driver so that it could manage the interrupt
state properly.

I'm not against your missed bit fix as a short-term cure for now, it's
just that somewhere down the road we need to manage the interrupt
properly.
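
To make the hazard concrete, here is a minimal user-space sketch, not
kernel code and not taken from any real driver. It assumes a hypothetical
device with a single interrupt-enable register, and every name in it
(struct fake_nic, IRQ_RX, fake_hardirq(), poll_done_toggle(),
poll_done_set(), toggle_hazard_demo()) is invented for illustration. It
only models the register arithmetic: a poll completion that toggles the
enable bit is correct when the hard irq handler masked the interrupt
first, and leaves the interrupt disabled when the poll was scheduled from
somewhere else.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define IRQ_RX 0x1u /* hypothetical RX interrupt-enable bit */

/* Hypothetical device state: one interrupt-enable register. */
struct fake_nic {
	uint32_t irq_enable;
};

/* Models the hard irq handler: mask the RX interrupt before the
 * poll is scheduled. */
void fake_hardirq(struct fake_nic *nic)
{
	nic->irq_enable &= ~IRQ_RX;
}

/* Poll completion that toggles the bit. Correct only if the bit is
 * currently clear, i.e. only if the hard irq handler ran first. */
void poll_done_toggle(struct fake_nic *nic)
{
	nic->irq_enable ^= IRQ_RX;
}

/* Poll completion that sets the bit explicitly. Idempotent, so it
 * does not depend on who scheduled the poll. */
void poll_done_set(struct fake_nic *nic)
{
	nic->irq_enable |= IRQ_RX;
}

bool rx_irq_enabled(const struct fake_nic *nic)
{
	return (nic->irq_enable & IRQ_RX) != 0;
}

/* Returns true if the toggle variant leaves the interrupt disabled
 * after a poll that was not preceded by the hard irq handler. */
bool toggle_hazard_demo(void)
{
	struct fake_nic nic = { .irq_enable = IRQ_RX };

	/* Normal path: hard irq masks, poll completion re-enables. */
	fake_hardirq(&nic);
	poll_done_toggle(&nic);
	assert(rx_irq_enabled(&nic));

	/* Poll scheduled from outside the hard irq handler: the bit is
	 * still set, so the toggle clears it. */
	poll_done_toggle(&nic);
	return !rx_irq_enabled(&nic);
}
```

In this model toggle_hazard_demo() reports the stuck-disabled state,
while replacing the second poll_done_toggle() with poll_done_set() would
leave the interrupt enabled. A real driver would also have to deal with
ordering against the hardware, which this sketch ignores entirely.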