On Thu, Aug 20, 2020 at 05:14:31PM -0700, Jakub Kicinski wrote:
> On Fri, 21 Aug 2020 02:04:39 +0300 Vladimir Oltean wrote:
> > On Thu, Jul 30, 2020 at 05:30:59PM -0700, Jakub Kicinski wrote:
> > > On Thu, 30 Jul 2020 18:53:58 -0400 David Thompson wrote:
> > > > +
> > > > +	/* Tell networking subsystem to poll GigE driver */
> > > > +	napi_schedule(&priv->napi);
> > >
> > > _irqoff
> >
> > Hmm, I wouldn't be so sure about this particular advice. With
> > PREEMPT_RT, interrupt handlers are force-threaded and run in process
> > context, therefore with hardirqs enabled. This driver doesn't call
> > request_irq with IRQF_NO_THREAD, so calling napi_schedule_irqoff would
> > create a bug that is very, very difficult to find.
>
> Doesn't PREEMPT_RT take a local_lock or some form thereof around the
> irq threads then? If it doesn't then we probably need one around NAPI.
>
> Regardless even if that's the case this is an existing issue, and not
> something that changes how the driver API would look.
So the thread function is surrounded by local_bh_disable:

/*
 * Interrupts which are not explicitly requested as threaded
 * interrupts rely on the implicit bh/preempt disable of the hard irq
 * context. So we need to disable bh here to avoid deadlocks and other
 * side effects.
 */
static irqreturn_t
irq_forced_thread_fn(struct irq_desc *desc, struct irqaction *action)
{
	irqreturn_t ret;

	local_bh_disable();
	ret = action->thread_fn(action->irq, action->dev_id);
	if (ret == IRQ_HANDLED)
		atomic_inc(&desc->threads_handled);

	irq_finalize_oneshot(desc, action);
	local_bh_enable();
	return ret;
}

but that doesn't really help in the case of napi_schedule_irqoff.

You see, one of these two functions ends up being called (from
napi_schedule or from napi_schedule_irqoff, respectively):

void raise_softirq(unsigned int nr)
{
	unsigned long flags;

	local_irq_save(flags);
	raise_softirq_irqoff(nr);
	local_irq_restore(flags);
}

void __raise_softirq_irqoff(unsigned int nr)
{
	trace_softirq_raise(nr);
	or_softirq_pending(1UL << nr);
}

And the "or_softirq_pending" function is not hardirq-safe, since it
doesn't use atomic operations; that's the whole problem right there.
It really wants to be called under local_irq_save:

#define or_softirq_pending(x)	(__this_cpu_or(local_softirq_pending_ref, (x)))

So the only real (safe) use case for napi_schedule_irqoff is when the
caller is already inside an atomic section (under local_irq_save).
Otherwise, it's best to just use napi_schedule.

By the way, I tested on a 10G link and there wasn't any performance
impact on non-RT to speak of. This is because on non-RT, hardirqs are
already disabled when the handler runs, so local_irq_save() translates
to only a check, with no action being taken.

Hope this helps,
-Vladimir
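P.S. To make the conclusion concrete, here is a rough sketch of the
two patterns; foo_isr(), foo_poke_napi() and struct foo_priv are
made-up names for illustration, not from the patch under review:

/* Runs in hardirq context on !RT, but from a kthread (with hardirqs
 * enabled) when force-threaded on PREEMPT_RT. Since hardirqs being
 * off cannot be assumed here, use plain napi_schedule(), which does
 * local_irq_save() itself before marking the softirq pending.
 */
static irqreturn_t foo_isr(int irq, void *dev_id)
{
	struct foo_priv *priv = dev_id;

	napi_schedule(&priv->napi);

	return IRQ_HANDLED;
}

/* napi_schedule_irqoff() is only safe when the caller has already
 * disabled hardirqs itself, e.g.:
 */
static void foo_poke_napi(struct foo_priv *priv)
{
	unsigned long flags;

	local_irq_save(flags);
	/* ... other work that requires hardirqs off ... */
	napi_schedule_irqoff(&priv->napi);
	local_irq_restore(flags);
}

Even in the second pattern, all that _irqoff saves is the flags
save/restore, which, as measured above, is noise.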