Hi Peter,
On 2025-09-19 13:50, Peter Xu wrote:
> On Fri, Sep 19, 2025 at 12:58:08PM -0400, Peter Xu wrote:
> > > @@ -2564,6 +2569,11 @@ static void *source_return_path_thread(void
> > > *opaque)
> > > tmp32 = ldl_be_p(buf);
> > > trace_source_return_path_thread_pong(tmp32);
> > > qemu_sem_post(&ms->rp_state.rp_pong_acks);
> > > + if (tmp32 == QEMU_VM_PING_PACKAGED_LOADED) {
> > > + trace_source_return_path_thread_dst_started();
> > > + migrate_set_state(&ms->state,
> > > MIGRATION_STATUS_POSTCOPY_DEVICE,
> > > + MIGRATION_STATUS_POSTCOPY_ACTIVE);
> >
> > Could this race with the migration thread modifying the state concurrently?
> >
> > To avoid it, we could have a bool, set it here once, and in the iterations
> > do something like:
> >
> > diff --git a/migration/migration.c b/migration/migration.c
> > index 10c216d25d..55230e10ee 100644
> > --- a/migration/migration.c
> > +++ b/migration/migration.c
> > @@ -3449,6 +3449,16 @@ static MigIterateState
> > migration_iteration_run(MigrationState *s)
> > trace_migrate_pending_estimate(pending_size, must_precopy,
> > can_postcopy);
> >
> > if (in_postcopy) {
> > + if (s->postcopy_package_loaded) {
> > + assert(s->state == MIGRATION_STATUS_POSTCOPY_DEVICE);
> > + migrate_set_state(s->state, MIGRATION_STATUS_POSTCOPY_DEVICE,
> > + MIGRATION_STATUS_POSTCOPY_ACTIVE);
> > + /*
> > + * Since postcopy cannot be re-initiated, this flag will only
> > + * be set at most once for QEMU's whole lifecyce.
> > + */
> > + s->postcopy_package_loaded = false;
> > + }
> > /*
> > * Iterate in postcopy until all pending data flushed. Note that
> > * postcopy completion doesn't rely on can_switchover, because when
It was there in the RFC version, when there was mutual handshake before
dst starting, but I thought cmp&exchange would be enough. I can add it
again, however, there is no need to set it to false afterwards, we can
simply check if this condition is true:
s->postcopy_package_loaded && s->state == MIGRATION_STATUS_POSTCOPY_DEVICE.
>
> [...]
>
> > > @@ -2871,7 +2882,7 @@ static int postcopy_start(MigrationState *ms, Error
> > > **errp)
> > >
> > > /* Now, switchover looks all fine, switching to postcopy-active */
> > > migrate_set_state(&ms->state, MIGRATION_STATUS_DEVICE,
> > > - MIGRATION_STATUS_POSTCOPY_ACTIVE);
> > > + MIGRATION_STATUS_POSTCOPY_DEVICE);
> > >
> > > bql_unlock();
> > >
> > > @@ -3035,7 +3046,8 @@ static void migration_completion(MigrationState *s)
> > >
> > > if (s->state == MIGRATION_STATUS_ACTIVE) {
> > > ret = migration_completion_precopy(s);
> > > - } else if (s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
> > > + } else if (s->state == MIGRATION_STATUS_POSTCOPY_DEVICE ||
> > > + s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
> >
> > Exactly. We need to be prepared that src sending too fast so when device
> > loading on dest we finished.
>
> One thing more to mention here.. which may void some previous comments I
> left. Let's discuss.
>
> I think it may also make sense to only allow a COMPLETE after
> POSTCOPY_ACTIVE.
>
> That is, if src sends too fast to have finished sending everything,
> reaching COMPLETE during POSTCOPY_DEVICE, that is, while before it receives
> the new PONG you defined, then.. I _think_ it is better to wait for that.
>
> If it finally arrives, then it's perfect, we switch to POSTCOPY_ACTIVE,
> then continue the completion.
>
> If the channel is broken before its arrival, logically we should handle
> this case as a FAILURE and restart the VM on src.
>
> It's only relevant in a corner case, but does that sound better?
Yes, it does make sense to wait for POSTCOPY_ACTIVE as src could finish
before dst finished package load and could still fail. We could use a
qemu_event_wait() to wait and set the event in src return path thread
when the PONG is received.
>
> --
> Peter Xu
>