Hello!

Kumar Kartikeya Dwivedi (CCed) privately reported a bug in
my implementation of the RCU Tasks Trace API in terms of SRCU-fast.
You see, I forgot to ask what contexts call_rcu_tasks_trace() is called
from, and it turns out that it can in fact be called with the scheduler
pi/rq locks held.  This results in a deadlock when SRCU-fast invokes the
scheduler in order to start the SRCU-fast grace period.  So RCU needs
a fix to my fix found here:

b540c63cf6e5 ("srcu: Use raw spinlocks so call_srcu() can be used under preempt_disable()")

Sebastian, the PREEMPT_RT aspect is that lockdep does not complain
about acquisition of non-raw spinlocks from preemption-disabled regions
of code.  This might be intentional:  For example, there might be large
bodies of Linux-kernel code that frequently acquire non-raw spinlocks
from preemption-disabled regions of code, but which are never part of
PREEMPT_RT kernels.  Otherwise, it might be good for lockdep to diagnose
this sort of thing.

Back to the actual bug, which is that call_srcu() now needs to tolerate
being called with scheduler rq/pi locks held...

The straightforward (but perhaps broken) way to resolve this is to make
srcu_gp_start_if_needed() defer invoking the scheduler, similar to the
way that vanilla RCU's call_rcu_core() function takes an early exit if
interrupts are disabled.  Of course, vanilla RCU can rely on things like
the scheduling-clock interrupt to start any needed grace periods [1],
but SRCU will instead need to manually defer this work, perhaps using
workqueues or IRQ work.
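
For illustration only, here is a userspace model of that deferral pattern.
It is not kernel code and not a proposed patch: every name in it is
hypothetical, and the work_pending flag merely stands in for whatever
mechanism (workqueue or IRQ work) would actually be used.

```c
/* Userspace model only.  All names are hypothetical, not kernel APIs. */
#include <stdbool.h>

struct model_srcu {
	bool gp_needed;       /* A grace period has been requested. */
	bool gp_started;      /* The grace period has been started. */
	bool work_pending;    /* Deferred kick queued (models irq_work/workqueue). */
	int direct_starts;    /* Grace periods started directly by the caller. */
	int deferred_starts;  /* Grace periods started from the deferred handler. */
};

/* Stands in for the step that might invoke the scheduler. */
static void model_start_gp(struct model_srcu *sp)
{
	sp->gp_started = true;
	sp->gp_needed = false;
}

/*
 * Models the early exit:  when the caller might hold scheduler locks
 * (unsafe_context), record the request and leave the start to a later,
 * safe context instead of starting the grace period here.
 */
static void model_gp_start_if_needed(struct model_srcu *sp, bool unsafe_context)
{
	sp->gp_needed = true;
	if (unsafe_context) {
		sp->work_pending = true;
		return;
	}
	model_start_gp(sp);
	sp->direct_starts++;
}

/* Models the deferred handler, which runs with no scheduler locks held. */
static void model_run_deferred(struct model_srcu *sp)
{
	if (!sp->work_pending)
		return;
	sp->work_pending = false;
	if (sp->gp_needed) {
		model_start_gp(sp);
		sp->deferred_starts++;
	}
}
```

In this model, a call with unsafe_context set leaves the request pending
until model_run_deferred() runs, while any other call starts the grace
period immediately.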

In addition, rcutorture needs to be upgraded to sometimes invoke
->call() with the scheduler pi lock held, but this change is not fixing
a regression, so it could be deferred.  (There is already code in rcutorture
that invokes the readers while holding a scheduler pi lock.)

Given that RCU for this week through the end of March belongs to you guys,
if one of you can get this done by end of day Thursday, London time,
very good!  Otherwise, I can put something together.

Please let me know!

                                                Thanx, Paul [2]

[1] The exceptions to this rule being handled by the call to
    invoke_rcu_core() when rcu_is_watching() returns false.

[2] Ah, and should vanilla RCU's call_rcu() be invokable from NMI
    handlers?  Or should there be a call_rcu_nmi() for this purpose?
    Or should we continue to have its callers check in_nmi() when needed?
