On Thu, Aug 07, 2025 at 10:41:17AM +0800, [email protected] wrote:
> diff --git a/migration/multifd.c b/migration/multifd.c
> index b255778855..aca0aeb341 100644
> --- a/migration/multifd.c
> +++ b/migration/multifd.c
> @@ -1228,6 +1228,16 @@ void multifd_recv_sync_main(void)
> }
> }
> trace_multifd_recv_sync_main_signal(p->id);
> + do {
> + if (qemu_sem_timedwait(&multifd_recv_state->sem_sync, 10000) == 0) {
> + break;
> + }
> + if (qemu_in_coroutine()) {
> + aio_co_schedule(qemu_get_current_aio_context(),
> + qemu_coroutine_self());
> + qemu_coroutine_yield();
> + }
> + } while (1);

I still think either yank or fixing migrate_cancel is the way to go.  But
staring at this change, I don't think I understand this patch at all.

It does a timedwait() on the sem_sync that we just consumed.  Don't you at
least need to remove the waits above this piece of code, so that it doesn't
hang forever?  I mean these:

    for (i = 0; i < thread_count; i++) {
        trace_multifd_recv_sync_main_wait(i);
        qemu_sem_wait(&multifd_recv_state->sem_sync);
    }
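
To illustrate the concern, here is a rough standalone sketch using plain
POSIX semaphores.  It is not QEMU code: extra_wait_times_out() is a made-up
helper, and it assumes each recv channel posts sem_sync exactly once per
sync.  Under that assumption the existing loop drains every post, so one
more timed wait has nothing left to consume and can only time out.  In the
patch that timeout sits inside a do/while(1), so it would just retry.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <semaphore.h>
#include <time.h>

/*
 * Hypothetical helper, not part of QEMU.
 * Returns 1 if an extra timed wait on a drained semaphore times out,
 * 0 if it unexpectedly succeeds, and -1 on any other error.
 */
int extra_wait_times_out(int thread_count, long timeout_ms)
{
    sem_t sem_sync;
    struct timespec deadline;
    int i, ret, err;

    if (sem_init(&sem_sync, 0, 0) != 0) {
        return -1;
    }

    /* Assumption: every channel posts once when it sees the sync point. */
    for (i = 0; i < thread_count; i++) {
        sem_post(&sem_sync);
    }

    /* The existing loop: consumes exactly thread_count posts. */
    for (i = 0; i < thread_count; i++) {
        sem_wait(&sem_sync);
    }

    /* Build an absolute deadline, as sem_timedwait() expects. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    deadline.tv_sec += timeout_ms / 1000 + deadline.tv_nsec / 1000000000L;
    deadline.tv_nsec %= 1000000000L;

    /* The added wait: the count is already zero at this point. */
    do {
        ret = sem_timedwait(&sem_sync, &deadline);
    } while (ret != 0 && errno == EINTR);
    err = errno;

    sem_destroy(&sem_sync);

    if (ret == 0) {
        return 0;
    }
    return err == ETIMEDOUT ? 1 : -1;
}
```

Calling extra_wait_times_out(4, 50) from a small test program should take
the timeout path, since nobody posts a fifth time.
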
> qemu_sem_post(&p->sem_sync);
> }
> trace_multifd_recv_sync_main(multifd_recv_state->packet_num);
> --
> 2.27.0
>
--
Peter Xu