On Wed, Mar 18, 2026 at 08:51:16AM -0700, Boqun Feng wrote:
> On Wed, Mar 18, 2026 at 03:43:05PM +0100, Sebastian Andrzej Siewior wrote:
> [..]
> > > > > way that vanilla RCU's call_rcu_core() function takes an early exit
> > > > > if interrupts are disabled.  Of course, vanilla RCU can rely on
> > > > > things like the scheduling-clock interrupt to start any needed grace
> > > > > periods [1], but SRCU will instead need to manually defer this work,
> > > > > perhaps using workqueues or IRQ work.
> > > > > 
> > > > > In addition, rcutorture needs to be upgraded to sometimes invoke
> > > > > ->call() with the scheduler pi lock held, but this change is not
> > > > > fixing a regression, so could be deferred.  (There is already code
> > > > > in rcutorture that invokes the readers while holding a scheduler pi
> > > > > lock.)
> > > > > 
> > > > > Given that RCU for this week through the end of March belongs to you
> > > > > guys, if one of you can get this done by end of day Thursday, London
> > > > > time, very good!  Otherwise, I can put something together.
> > > > > 
> > > > > Please let me know!
> > > > 
> > > > Given that the current locking does allow it and lockdep should have
> > > > complained, I am curious if we could rule that out ;)
> > 
> > Your patch just does s/spinlock_t/raw_spinlock_t/ so we get the
> > locking/nesting right.  The wakeup problem remains, right?
> > But looking at the code, there is just srcu_funnel_gp_start().  If its
> > srcu_schedule_cbs_sdp() / queue_delayed_work() usage is always delayed,
> > then there will always be a timer and never a direct wakeup of the
> > worker.  Wouldn't that work?
> 
> Late to the party, so just making sure I understand the problem.  The
> problem is the wakeup in call_srcu() when it is called with a scheduler
> lock held, right?  If so, I think the current code works as you already
> explained: we defer the wakeup into a workqueue.
The issue is that call_rcu_tasks() (which is call_srcu() now) is also
invoked with a scheduler pi/rq lock held, which results in a deadlock
cycle.  So the srcu_gp_start_if_needed() function's call to
raw_spin_lock_irqsave_sdp_contention() must be deferred to the workqueue
handler, not just the wakeup.  And that in turn means that the callback
pointer also needs to be passed to this handler.

See this email thread:

https://lore.kernel.org/all/cap01t75ekpvw+95nqnwg9p-1+kzvzojpn0nlat+28sf1b9w...@mail.gmail.com/

> (but Paul, we are not talking about calling call_srcu(), that requires
> some more work to get it to work)

Agreed, splitting srcu_gp_start_if_needed() and using a workqueue if
interrupts were already disabled on entry.  Otherwise, directly invoking
the split-out portion of srcu_gp_start_if_needed().

But we might be talking past each other.

							Thanx, Paul

> > > It would be nice, but your point about needing to worry about spinlocks
> > > is compelling.
> > > 
> > > But couldn't lockdep scan the current task's list of held locks and see
> > > whether only raw spinlocks are held (including when no spinlocks of any
> > > type are held), and complain in that case?  Or would that scanning be
> > > too high of overhead?  (But we need that scan anyway to check for
> > > deadlock, don't we?)
> > 
> > PeterZ didn't like it, and the nesting thing identified most of the
> > problem cases.  It should also catch _this_ one.
> > 
> > Thinking about it further, you don't need to worry about
> > local_bh_disable(), but RCU will become another corner case.  You would
> > have to exclude "rcu_read_lock(); spin_lock();" on a !preempt kernel,
> > which would otherwise lead to false positives.
> > But as I said, this case as explained is a nesting problem and should be
> > reported by lockdep with its current features.
> 
> Right, otherwise there is a lockdep bug ;-)
> 
> Regards,
> Boqun
