On 02/01/19 15:16, Laurent Vivier wrote:
> We can have a race condition between qemu_cpu_kick_thread() and
> qemu_kvm_cpu_thread_fn() when we hotunplug a CPU. In this case,
> qemu_cpu_kick_thread() can try to kick a thread that is exiting.
> pthread_kill() returns an error and qemu is stopped by an exit(1).
> 
>    qemu:qemu_cpu_kick_thread: No such process
> 
> We can safely ignore this error.
> 
> Signed-off-by: Laurent Vivier <lviv...@redhat.com>
> ---
>  cpus.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/cpus.c b/cpus.c
> index 0ddeeefc14..4717490bd0 100644
> --- a/cpus.c
> +++ b/cpus.c
> @@ -1778,7 +1778,7 @@ static void qemu_cpu_kick_thread(CPUState *cpu)
>      }
>      cpu->thread_kicked = true;
>      err = pthread_kill(cpu->thread->thread, SIG_IPI);
> -    if (err) {
> +    if (err && err != ESRCH) {
>          fprintf(stderr, "qemu:%s: %s", __func__, strerror(err));
>          exit(1);
>      }
> 

You could in principle be sending the signal to another thread, so the
fix is a bit hackish.  However, I don't have a better idea that is not
racy. :(

The problem is that qemu_cpu_kick does not use any spinlock or mutex to
synchronize against cpu_remove_sync's qemu_thread_join.  I think once
you reach qemu_cpu_kick in cpu_remove_sync (so if cpu->unplug) you
do not need to reset cpu->thread_kicked anymore, but I don't think
that's enough to fix it.
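
To illustrate what synchronizing the kick against the join could look
like, here is a minimal standalone sketch. It is not QEMU code and not
a proposed patch: VCpu, vcpu_start, vcpu_kick, vcpu_remove_sync and the
"joined" flag are hypothetical names, SIGUSR1 stands in for SIG_IPI,
and whether such a lock fits QEMU's locking rules is not something this
thread settles. The idea is that the thread handle is only signalled
while it is known not to have been joined, so ESRCH can only mean
"exited but not yet joined" and the signal cannot reach a recycled
thread ID.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for a vCPU; this is NOT QEMU's CPUState. */
typedef struct {
    pthread_t thread;
    pthread_mutex_t lock;   /* guards 'joined' and signalling 'thread' */
    bool joined;            /* set once the remover has claimed the join */
} VCpu;

/* Empty handler so that the kick signal does not terminate the process. */
static void ipi_handler(int sig) { (void)sig; }

/* Placeholder thread body: a real vCPU loop would run guest code here. */
static void *vcpu_thread_fn(void *opaque) { (void)opaque; return NULL; }

static int vcpu_start(VCpu *cpu)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = ipi_handler;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGUSR1, &sa, NULL);      /* SIGUSR1 stands in for SIG_IPI */

    pthread_mutex_init(&cpu->lock, NULL);
    cpu->joined = false;
    return pthread_create(&cpu->thread, NULL, vcpu_thread_fn, cpu);
}

/* Returns 0 if the kick was delivered or was no longer needed. */
static int vcpu_kick(VCpu *cpu)
{
    int err = 0;

    pthread_mutex_lock(&cpu->lock);
    if (!cpu->joined) {
        /* The handle is still valid here: nobody has joined it yet. */
        err = pthread_kill(cpu->thread, SIGUSR1);
        if (err == ESRCH) {
            err = 0;    /* thread already exited, join still pending */
        }
    }
    pthread_mutex_unlock(&cpu->lock);
    return err;
}

static void vcpu_remove_sync(VCpu *cpu)
{
    int err;

    /* Claim the join first; kicks already in flight finish under the lock. */
    pthread_mutex_lock(&cpu->lock);
    cpu->joined = true;
    pthread_mutex_unlock(&cpu->lock);

    /* Join outside the lock so that kickers are never blocked on it. */
    err = pthread_join(cpu->thread, NULL);
    assert(err == 0);
    (void)err;
}
```

A caller would run vcpu_start(&cpu), kick as often as needed with
vcpu_kick(&cpu), and finally call vcpu_remove_sync(&cpu); kicks issued
after that become no-ops instead of touching a stale handle.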

Paolo
