On 2025-09-22 12:16, Peter Xu wrote:
> On Mon, Sep 22, 2025 at 03:34:19PM +0200, Juraj Marcin wrote:
> > Hi Peter,
> > 
> > On 2025-09-19 13:50, Peter Xu wrote:
> > > On Fri, Sep 19, 2025 at 12:58:08PM -0400, Peter Xu wrote:
> > > > > @@ -2564,6 +2569,11 @@ static void *source_return_path_thread(void 
> > > > > *opaque)
> > > > >              tmp32 = ldl_be_p(buf);
> > > > >              trace_source_return_path_thread_pong(tmp32);
> > > > >              qemu_sem_post(&ms->rp_state.rp_pong_acks);
> > > > > +            if (tmp32 == QEMU_VM_PING_PACKAGED_LOADED) {
> > > > > +                trace_source_return_path_thread_dst_started();
> > > > > +                migrate_set_state(&ms->state, 
> > > > > MIGRATION_STATUS_POSTCOPY_DEVICE,
> > > > > +                                  MIGRATION_STATUS_POSTCOPY_ACTIVE);
> > > > 
> > > > Could this race with the migration thread modifying the state 
> > > > concurrently?
> > > > 
> > > > To avoid it, we could have a bool, set it here once, and in the 
> > > > iterations
> > > > do something like:
> > > > 
> > > > diff --git a/migration/migration.c b/migration/migration.c
> > > > index 10c216d25d..55230e10ee 100644
> > > > --- a/migration/migration.c
> > > > +++ b/migration/migration.c
> > > > @@ -3449,6 +3449,16 @@ static MigIterateState 
> > > > migration_iteration_run(MigrationState *s)
> > > >      trace_migrate_pending_estimate(pending_size, must_precopy, 
> > > > can_postcopy);
> > > >  
> > > >      if (in_postcopy) {
> > > > +        if (s->postcopy_package_loaded) {
> > > > +            assert(s->state == MIGRATION_STATUS_POSTCOPY_DEVICE);
> > > > +            migrate_set_state(s->state, 
> > > > MIGRATION_STATUS_POSTCOPY_DEVICE,
> > > > +                              MIGRATION_STATUS_POSTCOPY_ACTIVE);
> > > > +            /*
> > > > +             * Since postcopy cannot be re-initiated, this flag will 
> > > > only
> > > > +             * be set at most once for QEMU's whole lifecyce.
> > > > +             */
> > > > +            s->postcopy_package_loaded = false;
> > > > +        }
> > > >          /*
> > > >           * Iterate in postcopy until all pending data flushed.  Note 
> > > > that
> > > >           * postcopy completion doesn't rely on can_switchover, because 
> > > > when
> > 
> > It was there in the RFC version, when there was mutual handshake before
> > dst starting, but I thought cmp&exchange would be enough. I can add it
> > again, however, there is no need to set it to false afterwards, we can
> > simply check if this condition is true:
> > s->postcopy_package_loaded && s->state == MIGRATION_STATUS_POSTCOPY_DEVICE.
> 
> Setting it to false was a safety measure.  Indeed not needed, but when so
> we need to be extremely careful to always check with above two conditions
> to avoid it frequently triggers.  So I thought clearing it would be much
> easier to read.  I can wait and read the new version whatever you prefer.
> 
> > 
> > > 
> > > [...]
> > > 
> > > > > @@ -2871,7 +2882,7 @@ static int postcopy_start(MigrationState *ms, 
> > > > > Error **errp)
> > > > >  
> > > > >      /* Now, switchover looks all fine, switching to postcopy-active 
> > > > > */
> > > > >      migrate_set_state(&ms->state, MIGRATION_STATUS_DEVICE,
> > > > > -                      MIGRATION_STATUS_POSTCOPY_ACTIVE);
> > > > > +                      MIGRATION_STATUS_POSTCOPY_DEVICE);
> > > > >  
> > > > >      bql_unlock();
> > > > >  
> > > > > @@ -3035,7 +3046,8 @@ static void migration_completion(MigrationState 
> > > > > *s)
> > > > >  
> > > > >      if (s->state == MIGRATION_STATUS_ACTIVE) {
> > > > >          ret = migration_completion_precopy(s);
> > > > > -    } else if (s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
> > > > > +    } else if (s->state == MIGRATION_STATUS_POSTCOPY_DEVICE ||
> > > > > +               s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
> > > > 
> > > > Exactly.  We need to be prepared that src sending too fast so when 
> > > > device
> > > > loading on dest we finished.
> > > 
> > > One thing more to mention here.. which may void some previous comments I
> > > left.  Let's discuss.
> > > 
> > > I think it may also make sense to only allow a COMPLETE after
> > > POSTCOPY_ACTIVE.
> > > 
> > > That is, if src sends too fast to have finished sending everything,
> > > reaching COMPLETE during POSTCOPY_DEVICE, that is, while before it 
> > > receives
> > > the new PONG you defined, then.. I _think_ it is better to wait for that.
> > > 
> > > If it finally arrives, then it's perfect, we switch to POSTCOPY_ACTIVE,
> > > then continue the completion.
> > > 
> > > If the channel is broken before its arrival, logically we should handle
> > > this case as a FAILURE and restart the VM on src.
> > > 
> > > It's only relevant in a corner case, but does that sound better?
> > 
> > Yes, it does make sense to wait for POSTCOPY_ACTIVE as src could finish
> > before dst finished package load and could still fail. We could use a
> > qemu_event_wait() to wait and set the event in src return path thread
> > when the PONG is received.
> 
> Right.
> 
> Though since we want to move to POSTCOPY_ACTIVE asap when receiving the
> corresponding PONG, we may want to have something like qemu_event_read()
> just return "ev->value == EV_SET", as we can't event_wait() in the
> migration thread while pushing RAMs.
> 
> Or, to avoid touching library code, we can also introduce yet another bool,
> then when receiving the PONG we set both (1) the bool, and (2) the event.
> We can check the bool in the iterations (when set, wait() to consume the
> event; the event should have happened or just to happen), and wait() on the
> event when completing (when return, bool must have been set, hence can
> reset bool).

Yes, my initial idea was the latter with two new attributes a bool and
an event, definitely not waiting while there are still pages to send.
But I can also evaluate the first one before sending another version.

Thanks!

> 
> -- 
> Peter Xu
> 


Reply via email to