On 02/01/19 15:16, Laurent Vivier wrote:
> We can have a race condition between qemu_cpu_kick_thread() and
> qemu_kvm_cpu_thread_fn() when we hotunplug a CPU. In this case,
> qemu_cpu_kick_thread() can try to kick a thread that is exiting.
> pthread_kill() returns an error and qemu is stopped by an exit(1).
>
>    qemu:qemu_cpu_kick_thread: No such process
>
> We can ignore safely this error.
>
> Signed-off-by: Laurent Vivier <lviv...@redhat.com>
> ---
>  cpus.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/cpus.c b/cpus.c
> index 0ddeeefc14..4717490bd0 100644
> --- a/cpus.c
> +++ b/cpus.c
> @@ -1778,7 +1778,7 @@ static void qemu_cpu_kick_thread(CPUState *cpu)
>      }
>      cpu->thread_kicked = true;
>      err = pthread_kill(cpu->thread->thread, SIG_IPI);
> -    if (err) {
> +    if (err && err != ESRCH) {
>          fprintf(stderr, "qemu:%s: %s", __func__, strerror(err));
>          exit(1);
>      }
>

You could in principle be sending the signal to another thread (the
thread ID can be reused once the exiting thread has been joined), so
the fix is a bit hackish. However, I don't have a better idea that is
not racy. :(

The problem is that qemu_cpu_kick does not use any spinlock or mutex to
synchronize against cpu_remove_sync's qemu_thread_join.

I think once you reach qemu_cpu_kick in cpu_remove_sync (that is, if
cpu->unplug is set) you do not need to reset cpu->thread_kicked anymore,
but I don't think that's enough to fix it.

Paolo