* Jens Freimann ([email protected]) wrote:
> On Fri, Oct 11, 2019 at 06:11:33PM +0100, Dr. David Alan Gilbert wrote:
> > * Jens Freimann ([email protected]) wrote:
> > > This patch adds a new migration state called wait-unplug. It is entered
> > > after the SETUP state and will transition into ACTIVE once all devices
> > > were succesfully unplugged from the guest.
> > >
> > > So if a guest doesn't respond or takes long to honor the unplug request
> > > the user will see the migration state 'wait-unplug'.
> > >
> > > In the migration thread we query failover devices if they're are still
> > > pending the guest unplug. When all are unplugged the migration
> > > continues. We give it a defined number of iterations including small
> > > waiting periods before we proceed.
> > >
> > > Signed-off-by: Jens Freimann <[email protected]>
> [..]
> > > @@ -3260,6 +3271,27 @@ static void *migration_thread(void *opaque)
> > >
> > > qemu_savevm_state_setup(s->to_dst_file);
> > >
> > > + migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
> > > + MIGRATION_STATUS_WAIT_UNPLUG);
> >
> > I think I'd prefer if you only went into this state if you had any
> > devices that were going to need unplugging.
>
> Sure, that makes sense. I'll change it.
>
> > > + while (i < FAILOVER_UNPLUG_RETRIES &&
> > > + s->state == MIGRATION_STATUS_WAIT_UNPLUG) {
> > > + i++;
> > > + qemu_sem_timedwait(&s->wait_unplug_sem,
> > > FAILOVER_GUEST_UNPLUG_WAIT);
> > > + all_unplugged = qemu_savevm_state_guest_unplug_pending();
> > > + if (all_unplugged) {
> > > + break;
> > > + }
> > > + }
> > > +
> > > + if (all_unplugged) {
> > > + migrate_set_state(&s->state, MIGRATION_STATUS_WAIT_UNPLUG,
> > > + MIGRATION_STATUS_ACTIVE);
> > > + } else {
> > > + migrate_set_state(&s->state, MIGRATION_STATUS_WAIT_UNPLUG,
> > > + MIGRATION_STATUS_CANCELLING);
> > > + }
> >
> > I think you can get rid of both the timeout and the count and just make
> > sure that migrate_cancel works at this point.
>
> I see, I need to add the new state to migration_is_setup_or_active() or
> a cancel won't work.
You probably need to do that anyway given all the other places
is_setup_or_active is called.
> > This pushes the problem up a layer, which I think is fine.
>
> Seems good to me. To be clear, you're saying I should just poll on
> the device unplugged state? Like
>
> while (s->state == MIGRATION_STATUS_WAIT_UNPLUG &&
> !qemu_savevm_state_guest_unplug_pending()) {
> _ /* This block intentionally left blank */
> }
I'd keep the qemu_sem_timedwait in there, but with a short time out
(e.g. 250ms say); that way it doesn't eat cpu, but also the cancel still
happens quickly.
Dave
>
> regards,
> Jens
--
Dr. David Alan Gilbert / [email protected] / Manchester, UK