On Thu, Mar 12, 2026 at 06:27:05PM +0000, Peter Maydell wrote: > On Thu, 12 Mar 2026 at 18:13, Peter Xu <[email protected]> wrote: > > > > On Thu, Mar 12, 2026 at 07:34:34PM +0300, Vladimir Sementsov-Ogievskiy > > wrote: > > > Hmm. Probably, that this is the subsection, that exist only in our > > > downstream. > > > I don't know, was it mentioned clearly before, sorry if not. > > > > Ah.. Looks like it was not mentioned in the latest version posted: > > > > https://lore.kernel.org/all/[email protected]/ > > > > Let's always mention this, because it's important piece of info, e.g. to > > know upstream has yet no reproducer. > > > > > This subsection just not exist in master, and we are not going to > > > upstream it. > > > So, for upstream it's a "theoretical" bug in a protocol, which may be > > > triggered in > > > some future moment. And that's why there is not specific case in commit > > > message > > > but only assumption "Let's say there is a vmstate ...". > > > > In this case, for upstream we almost only lose but yet unknown gain with > > this fix then, because we stop throwing useful info for unknown new > > subsections, as PeterM pointed out: > > > > https://lore.kernel.org/all/cafeaca_mzxddz_xjpnz9kc4k01aoss_oh_qeol3g33kdsnh...@mail.gmail.com/ > > > > But I agree it is a potential issue, if we cannot justify QEMU can never > > hit it. > > > > Sending "how many subsections are there" is one viable option as you said, > > but it only solves this small problem, and knowing that number requires src > > QEMU looping over the subsections twice so a bit awkward (first time we > > need to kickoff ".needed()" hooks to do the counting; not all will be sent). > > > > The other way is to provide level information somehow in the stream. > > > > Say, we could attach START/END markers for each VMSD to be dumped. That > > may be able to help in other cases too, e.g. when we accidentally grow some > > VMSD fields breaking migration, then IIUC dest QEMU can provide a more > > accurate error message when it knows the bound of current VMSD that is > > being loaded. > > Could we perhaps do something like: > * mark up the vmstate sections we have that don't follow the > "/ is the separator" naming rule with some new struct field > that says "legacy_bad_subsection_name = true" or whatever > (or rename them if there are cases where we're OK with a > migration break, like "iotkit-secctl") > * have the migration code enforce the "subsection name follows the > / separator rule" via an assert when the vmstate is registered, > with a loosening for the ones marked as legacy names > * have the inbound migration code use the strong check on the > separator to distinguish "subsection" from "not subsection" > unless the vmstate it's doing inbound migration for is marked > as needing legacy name handling > ? > > I think that would keep migration compatibility, avoid the problem > of wrong subsection names creeping in in future, and in practice > mean we don't in future hit this issue upstream. (With more effort > it might be also possible to tie the legacy names to QEMU versioned > machines so that they can eventually be retired.)
Yes this is an option. This will cause a few other things to maintain for us, though.. - legacy_bad_subsection_name itself - two different paths of parsing subsections - if we want to get rid of this (i think we should..), we need the machine compat property for legacy_bad_subsection_name to get rid of it after 6 years We'd better also make sure we don't miss something when choosing the new legacy_bad_subsection_name candidates, or it'll start to fail migrations with a more strict check. That so far just sounds slightly more work and less benefit comparing to introducing START/END markers for VMSDs to me. With START/END markers for each VMSD (or something like it), we can actually drop the requirement on "sub-vmsd names to contain parents' vmsd names", because we have the level info, then it's clear which subsection belongs to which parent vmsd. I wished we have that already; we almost have it (QEMU_VM_SECTION_FULL, QEMU_VM_SECTION_FOOTER, etc.) we're very close except it's just that it only works for a whole section hence only top VMSD, rather than every one of them.. Thanks, -- Peter Xu
