Hi Paul, On 7/6/2025 1:18 PM, Paul E. McKenney wrote: > On Sat, Jul 05, 2025 at 04:39:16PM -0400, Joel Fernandes wrote: >> Extract the complex expedited handling condition in rcu_read_unlock_special() >> into a separate function rcu_unlock_needs_exp_handling() with detailed >> comments explaining each condition. >> >> This improves code readability. No functional change intended. > > Very nice!!!
Thanks! > > Some questions and comments interspersed below. I replied inline below: > > Thanx, Paul > >> Signed-off-by: Joel Fernandes <[email protected]> >> --- >> kernel/rcu/tree_plugin.h | 80 +++++++++++++++++++++++++++++++++++----- >> 1 file changed, 71 insertions(+), 9 deletions(-) >> >> diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h >> index baf57745b42f..8504d95bb35b 100644 >> --- a/kernel/rcu/tree_plugin.h >> +++ b/kernel/rcu/tree_plugin.h >> @@ -647,6 +647,72 @@ static void rcu_preempt_deferred_qs_handler(struct >> irq_work *iwp) >> local_irq_restore(flags); >> } >> >> +/* >> + * Check if expedited grace period processing during unlock is needed. >> + * >> + * This function determines whether expedited handling is required based on: >> + * 1. Task blocking an expedited grace period > > This is a heuristic. What we are actually checking is whether the task > is blocking *some* grace period and whether at least one task (maybe > this one, maybe not) is blocking an expedited grace period. Makes sense, I changed this to: * 1. Task blocking an expedited grace period (based on a heuristic, could be * false-positive, see below.) And the below comment to: /* * Check if this task is blocking an expedited grace period. If the * task was preempted within an RCU read-side critical section and is * on the expedited grace period blockers list (exp_tasks), we need * expedited handling to unblock the expedited GP. This is not an exact * check because 't' might not be on the exp_tasks list at all - its * just a fast heuristic that can be false-positive sometimes. */ if (t->rcu_blocked_node && READ_ONCE(t->rcu_blocked_node->exp_tasks)) return true; Hope that looks Ok. > > Why not an exact check? Because that would mean traversing the list > starting at ->exp_tasks, and that list could potentially contain every > task in the system. And I have received bug reports encountered on > systems with hundreds of thousands of tasks. Got it. > > I could imagine a more complex data structure that semi-efficiently > tracked exact information, but I could also imagine this not being worth > the effort. > >> + * 2. CPU participating in an expedited grace period >> + * 3. Strict grace period mode requiring expedited handling >> + * 4. RCU priority boosting needs when interrupts were disabled > > s/boosting/deboosting/ > Fixed, thanks. > >> + */ >> + if (t->rcu_blocked_node && READ_ONCE(t->rcu_blocked_node->exp_tasks)) >> + return true; >> + >> + /* >> + * Check if this CPU is participating in an expedited grace period. >> + * The expmask bitmap tracks which CPUs need to check in for the >> + * current expedited GP. If our CPU's bit is set, we need expedited >> + * handling to help complete the expedited GP. >> + */ >> + if (rdp->grpmask & READ_ONCE(rnp->expmask)) >> + return true; >> + >> + /* >> + * In CONFIG_RCU_STRICT_GRACE_PERIOD=y kernels, all grace periods >> + * are treated as short for testing purposes even if that means >> + * disturbing the system more. Check if either: >> + * - This CPU has not yet reported a quiescent state, or >> + * - This task was preempted within an RCU critical section >> + * In either case, requird expedited handling for strict GP mode. > > s/requird/required/ ;-) I meant "require" :-D. Will fix. > >> + */ >> + if (IS_ENABLED(CONFIG_RCU_STRICT_GRACE_PERIOD) && >> + ((rdp->grpmask & READ_ONCE(rnp->qsmask)) || t->rcu_blocked_node)) >> + return true; >> + >> + /* >> + * RCU priority boosting case: If a task is subject to RCU priority >> + * boosting and exits an RCU read-side critical section with interrupts >> + * disabled, we need expedited handling to ensure timely deboosting. >> + * Without this, a low-priority task could incorrectly run at high >> + * real-time priority for an extended period effecting real-time > > s/effecting/degrading/ to be more precise. > Fixed, thanks. - Joel

