On Thu, Jan 12, 2023 at 05:14:57PM -0500, Peter Xu wrote:
> On Thu, Jan 12, 2023 at 07:52:41PM +0000, Dr. David Alan Gilbert wrote:
> > * David Hildenbrand ([email protected]) wrote:
> > > On 12.01.23 18:56, Dr. David Alan Gilbert wrote:
> > > > * David Hildenbrand ([email protected]) wrote:
> > > > > For virtio-mem, we want to have the plugged/unplugged state of memory
> > > > > blocks available before migrating any actual RAM content, and perform
> > > > > sanity checks before touching anything on the destination. This
> > > > > information is immutable on the migration source while migration is 
> > > > > active,
> > > > > 
> > > > > We want to use this information for proper preallocation support with
> > > > > migration: currently, we don't preallocate memory on the migration 
> > > > > target,
> > > > > and especially with hugetlb, we can easily run out of hugetlb pages 
> > > > > during
> > > > > RAM migration and will crash (SIGBUS) instead of catching this 
> > > > > gracefully
> > > > > via preallocation.
> > > > > 
> > > > > Migrating device state via a vmsd before we start iterating is 
> > > > > currently
> > > > > impossible: the only approach that would be possible is avoiding a 
> > > > > vmsd
> > > > > and migrating state manually during save_setup(), to be restored 
> > > > > during
> > > > > load_state().
> > > > > 
> > > > > Let's allow for migrating device state via a vmsd early, during the
> > > > > setup phase in qemu_savevm_state_setup(). To keep it simple, we
> > > > > indicate applicable vmds's using an "immutable" flag.
> > > > > 
> > > > > Note that only very selected devices (i.e., ones seriously messing 
> > > > > with
> > > > > RAM setup) are supposed to make use of such early state migration.
> > > > > 
> > > > > Signed-off-by: David Hildenbrand <[email protected]>
> > > > > ---
> > > > >   include/migration/vmstate.h |  5 +++++
> > > > >   migration/savevm.c          | 14 ++++++++++++++
> > > > >   2 files changed, 19 insertions(+)
> > > > > 
> > > > > diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
> > > > > index ad24aa1934..dd06c3abad 100644
> > > > > --- a/include/migration/vmstate.h
> > > > > +++ b/include/migration/vmstate.h
> > > > > @@ -179,6 +179,11 @@ struct VMStateField {
> > > > >   struct VMStateDescription {
> > > > >       const char *name;
> > > > >       int unmigratable;
> > > > > +    /*
> > > > > +     * The state is immutable while migration is active and is saved
> > > > > +     * during the setup phase, to be restored early on the 
> > > > > destination.
> > > > > +     */
> > > > > +    int immutable;
> > > > 
> > > > A bool would be nicer (as it would for unmigratable above)
> > > 
> > > Yes, I chose an int for consistency with "unmigratable". I can turn that
> > > into a bool.
> > > 
> > > I'd even include a cleanup patch for unmigratable if it wouldn't be ...
> > > 
> > > $ git grep "unmigratable \=" | wc -l
> > > 29
> > 
> > It might be OK if you just change the declaration; I mean '1' is pretty
> > close to true? (I think...)
> > Anyway, at least make the new one a bool.
> 
> Agreed bool is better.  Can we rename it to something like "early_setup"?
> "immutable" isn't clear on its most important attribute (on when it'll be
> migrated).  Meanwhile I'd hope we can comment that explicitly.  I'd go with:
> 
>   /*
>    * This VMSD describes something that should be sent during setup phase
>    * of migration.  It plays similar role as save_setup() for explicitly
>    * registered vmstate entries, the only difference is the vmsd will be
>    * sent right at the start of migration.
>    */
>   bool early_setup;

Let me try some even better wording..

    /*
     * This VMSD describes something that should be sent during setup phase
     * of migration.  It plays similar role as save_setup() for explicitly
     * registered vmstate entries, so it can be seen as a way to describe
     * save_setup() in vmsd structures.
     *
     * One SaveStateEntry should either have the save_setup() specified or
     * the vmsd with early_setup set to true.  It should never have both
     * things set.
     */
    bool early_setup;

There's one tricky thing that we'll send QEMU_VM_SECTION_START for
save_setup() entries but QEMU_VM_SECTION_FULL for vmsd early_setup
entries.

David, do you think we can slightly modify your new version of
vmstate_save() so as to pass in the section_type?  I think it'll be even
cleaner to send QEMU_VM_SECTION_START for the early vmsds too.  I assume
this shouldn't affect your goal and anything else.

> 
> > 
> > > > >       int version_id;
> > > > >       int minimum_version_id;
> > > > >       MigrationPriority priority;
> > > > > diff --git a/migration/savevm.c b/migration/savevm.c
> > > > > index ff2b8d0064..536d6f662b 100644
> > > > > --- a/migration/savevm.c
> > > > > +++ b/migration/savevm.c
> > > > > @@ -1200,6 +1200,15 @@ void qemu_savevm_state_setup(QEMUFile *f)
> > > > >       trace_savevm_state_setup();
> > > > >       QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
> > > > > +        if (se->vmsd && se->vmsd->immutable) {
> > > > > +            ret = vmstate_save(f, se, ms->vmdesc);
> > > > > +            if (ret) {
> > > > > +                qemu_file_set_error(f, ret);
> > > > > +                break;
> > > > > +            }
> > > > > +            continue;
> > > > > +        }
> > > > > +
> > > > 
> > > > Does this give you the ordering you want? i.e. there's no guarantee here
> > > > that immutables come first?
> > > 
> > > Yes, for virtio-mem at least this is fine. There are no real ordering
> > > requirements in regard to save_setup().
> > > 
> > > I guess one could use vmstate priorities to affect the ordering, if
> > > required.
> > > 
> > > So for my use case this is good enough, any suggestions? Thanks.
> > 
> > OK, but consider whether it might be better just to have a separate
> > QTAILQ_FOREACH look in savevm_state_setup that first does all the
> > immutables, and then all the setups.
> 
> After patch 1 the order may not matter iiuc, because each call to the
> immutable vmsds calls the new vmstate_save() which will always send
> QEMU_VM_SECTION_FULL and footers along the vmsd.
> 
> Thanks,
> 
> -- 
> Peter Xu

-- 
Peter Xu


Reply via email to