On Wed, 25 Mar 2026 10:25:42 +0800 (CST) <[email protected]> wrote:
> >On Tue, 24 Mar 2026 15:06:16 +0800 (CST)
> ><[email protected]> wrote:
> >
> >> From: luohaiyang10243395 <[email protected]>
> >>
> >> The following sequence may leads deadlock in cpu hotplug:
> >>
> >>     CPU0                        |  CPU1
> >>                                 |  schedule_work_on
> >>                                 |
> >>     _cpu_down//set CPU1 offline |
> >>     cpus_write_lock             |
> >>                                 |  osnoise_hotplug_workfn
> >>                                 |    mutex_lock(&interface_lock);
> >>                                 |    cpus_read_lock();  //wait cpu_hotplug_lock
> >>                                 |
> >>                                 |  cpuhp/1
> >>                                 |    osnoise_cpu_die
> >>                                 |      kthread_stop
> >>                                 |        wait_for_completion //wait osnoise/1 exit
> >>                                 |
> >>                                 |  osnoise/1
> >>                                 |    osnoise_sleep
> >>                                 |      mutex_lock(&interface_lock); //deadlock
> >>
> >> Fix by swap the order of cpus_read_lock() and mutex_lock(&interface_lock).
> >>
> >
> >So the deadlock is due to the "wait_for_completion"?
>
> The osnoise_cpu_init callback returns directly, which may allow another CPU offline task to run,
> the offline task holds the cpu_hotplug_lock while waiting for the osnoise task to exit.
> osnoise_hotplug_workfn may acquire interface_lock first, causing the offline task to be blocked.
> This is an ABBA deadlock.

Right, as I said, it is due to the "wait_for_completion" and not due to two
different locks. One is waiting for the osnoise task to exit (the
"wait_for_completion") but the osnoise task is blocked on the
interface_lock().

Better to show it as:

    task1                        task2                     task3
    -----                        -----                     -----

 mutex_lock(&interface_lock)

                            [CPU GOING OFFLINE]

                            cpus_write_lock();
                            osnoise_cpu_die();
                              kthread_stop(task3);
                                wait_for_completion();

                                                      osnoise_sleep();
                                                        mutex_lock(&interface_lock);

 cpus_read_lock();

 [DEAD LOCK]

>
> >How did you find this bug? Inspection, AI, triggered?
> >
> >Thanks,
> >
> >-- Steve
>
> We run autotests on kernel-6.6, report following hung task warning, and we think the same issue exists
> in linux-stable. Thanks.

It's usually good to state how a bug was discovered when fixing it. Could
you send a v2 with an updated change log?

-- Steve
