On Wed, 25 Mar 2026 10:25:42 +0800 (CST)
<[email protected]> wrote:

> >On Tue, 24 Mar 2026 15:06:16 +0800 (CST)
> ><[email protected]> wrote:
> >  
> >> From: luohaiyang10243395 <[email protected]>
> >> 
> >> The following sequence may lead to a deadlock in CPU hotplug:
> >> 
> >>   CPU0                        |  CPU1
> >>                               |  schedule_work_on
> >>                               |
> >>   _cpu_down//set CPU1 offline |
> >>   cpus_write_lock             |
> >>                               |  osnoise_hotplug_workfn
> >>                               |    mutex_lock(&interface_lock);
> >>                               |    cpus_read_lock();  //wait cpu_hotplug_lock
> >>                               |
> >>                               |  cpuhp/1
> >>                               |    osnoise_cpu_die
> >>                               |      kthread_stop
> >>                               |        wait_for_completion //wait osnoise/1 exit
> >>                               |
> >>                               |  osnoise/1
> >>                               |    osnoise_sleep
> >>                               |      mutex_lock(&interface_lock); //deadlock
> >> 
> >> Fix by swapping the order of cpus_read_lock() and mutex_lock(&interface_lock).
> >>  
> >
> >So the deadlock is due to the "wait_for_completion"?  
> 
> The osnoise_cpu_init callback returns directly, which may allow another CPU
> offline task to run. The offline task holds cpu_hotplug_lock while waiting
> for the osnoise task to exit. osnoise_hotplug_workfn may acquire
> interface_lock first, which blocks the osnoise task and therefore the
> offline task. This is an ABBA deadlock.

Right, as I said, it is due to the "wait_for_completion" and not due to two
different locks. One is waiting for the osnoise task to exit (the
"wait_for_completion") but the osnoise task is blocked on interface_lock.

Better to show it as:


    task1               task2           task3
    -----               -----           -----

 mutex_lock(&interface_lock)

                    [CPU GOING OFFLINE]

                    cpus_write_lock();
                    osnoise_cpu_die();
                      kthread_stop(task3);
                        wait_for_completion();

                                      osnoise_sleep();
                                        mutex_lock(&interface_lock);

 cpus_read_lock();

 [DEADLOCK]
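
To make the ordering argument concrete, here is a rough userspace model of
the three tasks. It is only a sketch, not kernel code: the struct, the enum
and all the helper names are made up for illustration, and the "locks" are
plain flags that are tried once rather than waited on. It assumes the CPU
offline path already holds the hotplug write lock when the work function runs.

```c
#include <stdbool.h>

/* Userspace model only; every name here is illustrative, not kernel API. */
enum lock_order {
    INTERFACE_THEN_CPUS, /* order before the patch */
    CPUS_THEN_INTERFACE  /* order after the patch */
};

struct model {
    bool hotplug_write_held; /* task2 is inside cpus_write_lock() */
    bool interface_held;     /* some task owns interface_lock */
};

/* A reader can only get in while no writer holds the hotplug lock. */
static bool try_cpus_read_lock(const struct model *m)
{
    return !m->hotplug_write_held;
}

static bool try_interface_lock(struct model *m)
{
    if (m->interface_held)
        return false;
    m->interface_held = true;
    return true;
}

/* task1: run the hotplug work function until it finishes or blocks. */
static void run_hotplug_work(struct model *m, enum lock_order order)
{
    if (order == INTERFACE_THEN_CPUS) {
        if (!try_interface_lock(m))
            return;                /* blocked, holding nothing */
        if (!try_cpus_read_lock(m))
            return;                /* blocked while holding interface_lock */
    } else {
        if (!try_cpus_read_lock(m))
            return;                /* blocked, holding nothing */
        if (!try_interface_lock(m))
            return;                /* not reached in this scenario */
    }
    m->interface_held = false;     /* work done, drop interface_lock */
}

/*
 * task2 holds the hotplug write lock and waits for task3 (the osnoise
 * thread) to exit. Returns true if task3 can still take interface_lock,
 * i.e. there is no deadlock.
 */
static bool osnoise_thread_can_exit(enum lock_order order)
{
    struct model m = { .hotplug_write_held = true, .interface_held = false };

    run_hotplug_work(&m, order);   /* task1 */
    return try_interface_lock(&m); /* task3 in osnoise_sleep() */
}
```

With the old order task1 gets stuck while owning interface_lock, so task3
can never exit and task2 never drops the write lock. With the swapped order
task1 blocks before it owns anything, and task3 is free to finish.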

> 
> >How did you find this bug? Inspection, AI, triggered?
> >
> >Thanks,
> >
> >-- Steve  
> 
> We ran autotests on kernel-6.6, which reported the following hung task
> warning, and we think the same issue exists in linux-stable.

Thanks. It's usually good to state how a bug was discovered when fixing it.

Could you send a v2 with an updated change log?

-- Steve
