On Sat, Jul 05, 2025 at 01:23:27PM -0400, Joel Fernandes wrote:
> Recently while revising RCU's cpu online checks, there was some discussion
> around how IPIs synchronize with hotplug.
> 
> Add comments explaining how preemption disable creates mutual exclusion with
> CPU hotplug's stop_machine mechanism. The key insight is that stop_machine()
> atomically updates CPU masks and flushes IPIs with interrupts disabled, and
> cannot proceed while any CPU (including the IPI sender) has preemption
> disabled.
> 
> Cc: Andrea Righi <[email protected]>
> Cc: Paul E. McKenney <[email protected]>
> Cc: Frederic Weisbecker <[email protected]>
> Cc: [email protected]
> Co-developed-by: Frederic Weisbecker <[email protected]>
> Signed-off-by: Joel Fernandes <[email protected]>

Acked-by: Paul E. McKenney <[email protected]>

> ---
> v1->v2: Reworded a bit more (minor nit).
> 
>  kernel/cpu.c |  4 ++++
>  kernel/smp.c | 12 ++++++++++++
>  2 files changed, 16 insertions(+)
> 
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index a59e009e0be4..a8ce1395dd2c 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -1310,6 +1310,10 @@ static int takedown_cpu(unsigned int cpu)
>  
>  	/*
>  	 * So now all preempt/rcu users must observe !cpu_active().
> +	 *
> +	 * stop_machine() waits for all CPUs to enable preemption. This lets
> +	 * take_cpu_down() atomically update CPU masks and flush last IPI
> +	 * before new IPIs can be attempted to be sent.
>  	 */
>  	err = stop_machine_cpuslocked(take_cpu_down, NULL, cpumask_of(cpu));
>  	if (err) {
> diff --git a/kernel/smp.c b/kernel/smp.c
> index 974f3a3962e8..842691467f9e 100644
> --- a/kernel/smp.c
> +++ b/kernel/smp.c
> @@ -93,6 +93,9 @@ int smpcfd_dying_cpu(unsigned int cpu)
>  	 * explicitly (without waiting for the IPIs to arrive), to
>  	 * ensure that the outgoing CPU doesn't go offline with work
>  	 * still pending.
> +	 *
> +	 * This runs in stop_machine's atomic context with interrupts disabled,
> +	 * thus CPU offlining and IPI flushing happen together atomically.
>  	 */
>  	__flush_smp_call_function_queue(false);
>  	irq_work_run();
> @@ -418,6 +421,10 @@ void __smp_call_single_queue(int cpu, struct llist_node *node)
>   */
>  static int generic_exec_single(int cpu, call_single_data_t *csd)
>  {
> +	/*
> +	 * Preemption must be disabled by caller to mutually exclude with
> +	 * stop_machine() atomically updating CPU masks and flushing IPIs.
> +	 */
>  	if (cpu == smp_processor_id()) {
>  		smp_call_func_t func = csd->func;
>  		void *info = csd->info;
> @@ -640,6 +647,11 @@ int smp_call_function_single(int cpu, smp_call_func_t func, void *info,
>  	/*
>  	 * prevent preemption and reschedule on another processor,
>  	 * as well as CPU removal
> +	 *
> +	 * get_cpu() disables preemption, ensuring mutual exclusion with
> +	 * stop_machine() where CPU offlining and last IPI flushing happen
> +	 * atomically versus this code. This guarantees here that the cpu_online()
> +	 * check and IPI sending are safe without losing IPIs due to offlining.
>  	 */
>  	this_cpu = get_cpu();
> 
> -- 
> 2.43.0
> 
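
For readers following along, here is a minimal caller-side sketch of the
pattern these comments describe. It is purely illustrative and not part of
the patch: poke_cpu_if_online() and remote_poke_func() are made-up names,
while cpu_online(), preempt_disable()/preempt_enable() and
smp_call_function_single() are the real kernel APIs. The point is that
holding preemption disabled across the cpu_online() check and the IPI send
is what keeps stop_machine(), and hence take_cpu_down(), from offlining the
target CPU in between.

#include <linux/cpumask.h>
#include <linux/errno.h>
#include <linux/preempt.h>
#include <linux/smp.h>

/* Runs in IPI (hardirq) context on the target CPU. */
static void remote_poke_func(void *info)
{
}

static int poke_cpu_if_online(int cpu)
{
	int ret = -ENXIO;

	/*
	 * Disabling preemption holds off stop_machine(): take_cpu_down()
	 * cannot run until every CPU re-enables preemption, so the
	 * cpu_online() check below cannot go stale before the IPI is
	 * queued, and a queued IPI is flushed before the CPU goes offline.
	 */
	preempt_disable();
	if (cpu_online(cpu))
		/* wait=1: return only after remote_poke_func() has run. */
		ret = smp_call_function_single(cpu, remote_poke_func, NULL, 1);
	preempt_enable();

	return ret;
}

Without the preempt_disable()/preempt_enable() pair, the target CPU could be
taken down between the cpu_online() check and the csd being queued, which is
exactly the race the new comments spell out.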

