On Tue, Feb 22, 2022 at 10:57:34AM +0000, Dr. David Alan Gilbert wrote:
> * Peter Xu ([email protected]) wrote:
> > We used to use postcopy_try_recover() to replace migration_incoming_setup()
> > to
> > setup incoming channels. That's fine for the old world, but in the new
> > world
> > there can be more than one channels that need setup. Better move the
> > channel
> > setup out of it so that postcopy_try_recover() only handles the last phase
> > of
> > switching to the recovery phase.
> >
> > To do that in migration_fd_process_incoming(), move the
> > postcopy_try_recover()
> > call to be after migration_incoming_setup(), which will setup the channels.
> > While in migration_ioc_process_incoming(), postpone the recover() routine
> > right
> > before we'll jump into migration_incoming_process().
> >
> > A side benefit is we don't need to pass in QEMUFile* to
> > postcopy_try_recover()
> > anymore. Remove it.
> >
> > Signed-off-by: Peter Xu <[email protected]>
>
> OK, but note one question below:
>
> Reviewed-by: Dr. David Alan Gilbert <[email protected]>
Thanks.
>
> > ---
> > migration/migration.c | 23 +++++++++++------------
> > 1 file changed, 11 insertions(+), 12 deletions(-)
> >
> > diff --git a/migration/migration.c b/migration/migration.c
> > index 67520d3105..b2e6446457 100644
> > --- a/migration/migration.c
> > +++ b/migration/migration.c
> > @@ -665,19 +665,20 @@ void migration_incoming_process(void)
> > }
> >
> > /* Returns true if recovered from a paused migration, otherwise false */
> > -static bool postcopy_try_recover(QEMUFile *f)
> > +static bool postcopy_try_recover(void)
> > {
> > MigrationIncomingState *mis = migration_incoming_get_current();
> >
> > if (mis->state == MIGRATION_STATUS_POSTCOPY_PAUSED) {
> > /* Resumed from a paused postcopy migration */
> >
> > - mis->from_src_file = f;
> > + /* This should be set already in migration_incoming_setup() */
> > + assert(mis->from_src_file);
> > /* Postcopy has standalone thread to do vm load */
> > - qemu_file_set_blocking(f, true);
> > + qemu_file_set_blocking(mis->from_src_file, true);
>
> Does that set_blocking happen on the 2nd channel somewhere?
Nop. I think the rational is that by default all channels are blocking.
Then what happened is: migration code only sets the main channel to
non-blocking on incoming, that's in migration_incoming_setup(). Hence for
postcopy recovery we need to tweak it to blocking here.
The 2nd new channel is not operated by migration_incoming_setup(), but by
postcopy_preempt_new_channel(), so it keeps the original blocking state,
which should be blocking.
If we want to make that clear, we can proactively set non-blocking too in
postcopy_preempt_new_channel() on the 2nd channel. It's just that it
should be optional as long as blocking is the default for any new fd of a
socket.
Thanks,
--
Peter Xu