Re: [PATCH] migration: fix parsing snapshots with x-ignore-shared flag

Fabiano Rosas Wed, 26 Nov 2025 04:46:57 -0800

Peter Xu <[email protected]> writes:

> On Tue, Nov 25, 2025 at 06:40:12PM -0300, Fabiano Rosas wrote:
>> Peter Xu <[email protected]> writes:
>> 
>> > On Tue, Nov 25, 2025 at 05:46:49PM +0000, Pawel Zmarzly wrote:
>> >> Snapshots made with mapped-ram and x-ignore-shared flags are
>> >> not parsed properly.
>> >> 
>> >> Signed-off-by: Pawel Zmarzly <[email protected]>
>> >> ---
>> >>  migration/ram.c | 5 +++++
>> >>  1 file changed, 5 insertions(+)
>> >> 
>> >> diff --git a/migration/ram.c b/migration/ram.c
>> >> index 29f016cb25..85fdc810ab 100644
>> >> --- a/migration/ram.c
>> >> +++ b/migration/ram.c
>> >> @@ -4277,6 +4277,11 @@ static int parse_ramblocks(QEMUFile *f, ram_addr_t total_ram_bytes)
>> >>          id[len] = 0;
>> >>          length = qemu_get_be64(f);
>> >>  
>> >> +        if (migrate_ignore_shared()) {
>> >> +            /* Read and discard the x-ignore-shared memory region address */
>> >> +            qemu_get_be64(f);
>> >> +        }
>> >> +
>> >>          block = qemu_ram_block_by_name(id);
>> >>          if (block) {
>> >>              ret = parse_ramblock(f, block, length);
>> >> -- 
>> >> 2.52.0
>> >> 
>> >
>> > Thanks for the patch, though the u64 was parsed in parse_ramblock()
>> > instead.  Would you consider refactoring that function instead?
>> 
>> There's actually not much going on in terms of "parsing" in
>> parse_ramblock(). I think we could move the migrate_ignore_shared() from
>> the end of the function to before the mapped-ram check().
>
> Yes, that's also what I meant if it wasn't clear.. it was parsed into a
> hwaddr, and it was used to verify the addresses match.
>
> If that check is needed for ignore-shared blocks, then these checks should
> also apply when mapped-ram is enabled on top of whatever ramblock got
> ignored during migration.
>


Right, because ignore_shared implies putting the MR address in the
stream, but there is still the matter of whether the pages will actually
be read on the destination.
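
To make the ordering problem concrete, here is a toy model of the
per-block header. It is only a sketch: the toy_* names, the flat buffer
standing in for a QEMUFile, and the reduced header (length, then the
address only when ignore-shared is set) are all assumptions made for
illustration, not the real stream layout or QEMU code. The addr_first
parameter chooses between consuming the address before the mapped-ram
early return and consuming it only afterwards.

```c
#include <assert.h>   /* only needed for the checks in the note below */
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a QEMUFile: a flat byte buffer with a read cursor. */
struct toy_stream {
    uint8_t buf[256];
    size_t len;   /* bytes written so far */
    size_t pos;   /* read cursor */
};

static void toy_put_be64(struct toy_stream *s, uint64_t v)
{
    for (int i = 7; i >= 0; i--) {
        s->buf[s->len++] = (uint8_t)(v >> (i * 8));
    }
}

static uint64_t toy_get_be64(struct toy_stream *s)
{
    uint64_t v = 0;

    for (int i = 0; i < 8; i++) {
        v = (v << 8) | s->buf[s->pos++];
    }
    return v;
}

/* Hypothetical reduced header: length, then addr only with ignore-shared. */
static void toy_save_block(struct toy_stream *s, uint64_t length,
                           uint64_t mr_addr, int ignore_shared)
{
    toy_put_be64(s, length);
    if (ignore_shared) {
        toy_put_be64(s, mr_addr);
    }
}

/*
 * Returns the length field of the next block.
 * addr_first != 0: consume the address before the mapped-ram branch.
 * addr_first == 0: consume it only after the branch, so the mapped-ram
 *                  early return leaves it unread in the stream.
 */
static uint64_t toy_load_block(struct toy_stream *s, int ignore_shared,
                               int mapped_ram, int addr_first)
{
    uint64_t length = toy_get_be64(s);

    if (addr_first && ignore_shared) {
        (void)toy_get_be64(s);
    }
    if (mapped_ram) {
        return length;   /* early return on the mapped-ram path */
    }
    if (!addr_first && ignore_shared) {
        (void)toy_get_be64(s);
    }
    return length;
}
```

In this model, saving two blocks with ignore-shared set and loading them
with mapped_ram=1 and addr_first=0 makes the second "length" come back
as the first block's address, i.e. the reader is 8 bytes out of step.
With addr_first=1 both lengths come back intact.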

Should ram_save_setup() use RAMBLOCK_FOREACH_NOT_IGNORED instead of
RAMBLOCK_FOREACH_MIGRATABLE? I don't immediately see why not...

> Since the discussion started, I am actually not sure if we do this all
> right for two things..
>
> (1) When mapped-ram is enabled, do we actually need to setup those
>     ramblocks in mapped_ram_setup_ramblock()?
>
>     That is, when a ramblock returns migrate_ram_is_ignored()==true, IIUC
>     we don't need to allocate bitmap or page chunks for it?
>
>     We likely don't need to change this easily, because this will change
>     file format.. I'm also not sure if this is a major issue, logically
>     when ignore-shared is used we normally shouldn't need mapped-ram.. vice
>     versa.  So I may need to better understand the use case first on
>     enabling the two..
>

Thanks, Peter. A keen eye, as usual. I was searching for this reasoning
when looking at the code, but I missed it.

I looked back at the very first version of fixed-ram, which wasn't
authored by me and there is indeed no mention or expectation of handling
shared ram. So I think this is at this point unspecified.

What is the current impact of having those pages in? We're "just"
wasting cycles writing to the file, AFAICS. We'd better, at least,
sanitise that part to avoid the extra work.

> (2) Is the check proper on validating mr->addr didn't change?
>
>     This is a question on the check itself when ignore-shared enabled,
>     with/without mapped-ram enabled.  That is, I question whether this
>     check is useful or valid at all:
>     
>     if (migrate_ignore_shared()) {
>         hwaddr addr = qemu_get_be64(f);
>         if (migrate_ram_is_ignored(block) &&
>             block->mr->addr != addr) {
>             error_report("Mismatched GPAs for block %s "
>                          "%" PRId64 "!= %" PRId64, block->idstr,
>                          (uint64_t)addr, (uint64_t)block->mr->addr);
>             return -EINVAL;
>         }
>     }
>

I agree with all you say below, but I think an earlier question would
be: why put the address on the stream in the first place? Is this just
hardening of some sort?

The commit that introduces the feature has me wondering:

fbd162e629 ("migration: Add an ability to ignore shared RAM blocks")

  during save:
  
  +        if (migrate_ignore_shared()) {
  +            qemu_put_be64(f, block->mr->addr);
  +            qemu_put_byte(f, ramblock_is_ignored(block) ? 1 : 0);
  +        }

  during load:
  +    if (ramblock_is_ignored(block)) {
           error_report("block %s should not be migrated !", id);
           return NULL;
       }

If we know it's ignored, why send anything at all? (also, "to ignore"
has a meaning, we should stick to it)

>     In the error, it said "GPA", but mr->addr isn't GPA.. it's the offset
>     of the MR within the MR's parent container MR..  So if the parent is
>     the root MR / system_memory, then it is the GPA, however I don't see it
>     guaranteed..
>

Looking at the initial commit, I think this is all sanity check, maybe
to ensure some sort of stream compatibility. Or to make sure the stream
is stateful and we're not confusing an ignored block with an
(incorrectly) ignored one.
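
On the "mr->addr isn't GPA" point, a small sketch of why the two can
differ. The struct toy_region type and toy_absolute_addr() are
hypothetical, not QEMU's MemoryRegion API; the only assumption carried
over from the thread is that addr is an offset within the parent
container.

```c
#include <assert.h>   /* only needed for the checks in the note below */
#include <stddef.h>
#include <stdint.h>

/*
 * Toy region: 'addr' is an offset within 'container'.
 * This is an illustration only, not QEMU's MemoryRegion.
 */
struct toy_region {
    uint64_t addr;
    const struct toy_region *container;   /* NULL for the root */
};

/* Sum the offsets up the container chain to get an absolute address. */
static uint64_t toy_absolute_addr(const struct toy_region *mr)
{
    uint64_t abs = 0;

    for (; mr != NULL; mr = mr->container) {
        abs += mr->addr;
    }
    return abs;
}
```

For a region placed directly under a root at offset 0, addr and the
absolute address coincide. For a region at offset 0x1000 inside a
container that itself sits at 0x80000000, the absolute address is
0x80001000 while addr is still 0x1000, so comparing addr alone would
not be comparing GPAs.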

>     My gut feeling is we almost always rely on proper QEMU cmdlines anyway
>     to make migration work.  I wonder if we should just remove this check
>     (in case it might break when mr's parent isn't the root MR).
>

I believe it's ok to remove the check. I wish we had a compatible way to
remove handling of ignored blocks altogether, but I guess this u64 is
now on the stream forever?

> This is irrelevant of this specific fix, so it doesn't need to block a
> repost..
