On Mon, 26 Jan 2026, Eric Auger wrote:
When migrating ARM guests between identical machines running different host kernels, we are likely to encounter failures such as:

"failed to load cpu:cpreg_vmstate_array_len"

This is because KVM exposes a different number of registers to QEMU on the source and on the destination. When trying to migrate a bigger register set to a smaller one, QEMU cannot save the CPU state.

For example, we recently faced this kind of situation with:
- unconditional exposure of the KVM_REG_ARM_VENDOR_HYP_BMAP_2 FW pseudo register from v6.16 onwards. This causes backward migration failure.
- removal of the unconditional exposure of TCR2_EL1, PIRE0_EL1 and PIR_EL1 from v6.13 onwards. This causes forward migration failure.

This situation is really problematic for distributions that want to guarantee forward and backward migration of a given machine type between different releases.

While the series mainly targets KVM acceleration, the problem also exists with TCG. For instance, some registers may be exposed when they shouldn't be, and it is then tricky to fix that without breaking forward migration. An example was provided by Peter: 4f2b82f60 ("target/arm: Reinstate bogus AArch32 DBGDTRTX register for migration compat").

This series introduces 2 CPU array properties that list:
- the CPU registers to hide from the exposed sysregs (aims at removing registers from the destination)
- the CPU registers that may not exist on the destination but can be found in the incoming migration stream (aims at ignoring extra registers in the incoming state)

An example is given to illustrate how those props could be used to apply compats for machine types that are supposed to "see" the same register set across various host kernels.

Mitigation of the DBGDTRTX issue would be achieved by setting x-mig-safe-missing-regs=0x40200000200e0298, which matches the AArch32 DBGDTRTX register index.

The first patch improves the tracing so that we can quickly detect which registers do not match between the incoming stream and the exposed sysregs.
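
To make the intent of the two lists concrete, here is a minimal Python sketch of the behaviour described above. It is only a model, not QEMU code: the function names (visible_regs, check_incoming) and the small placeholder register indexes are hypothetical, and only the DBGDTRTX index is taken from the cover letter.

```python
# Toy model of the two proposed register lists. Not QEMU code: the
# function names and the small placeholder indexes are hypothetical.

# AArch32 DBGDTRTX register index, as quoted in the cover letter.
DBGDTRTX_AA32 = 0x40200000200E0298


def visible_regs(host_regs, hidden_regs):
    """Return the registers still exposed after hiding `hidden_regs`.

    Models the first list: registers removed from the exposed sysregs.
    """
    hidden = set(hidden_regs)
    return [reg for reg in host_regs if reg not in hidden]


def check_incoming(incoming_regs, dest_regs, safe_missing_regs):
    """Classify incoming registers that the destination does not expose.

    Models the second list: an unknown incoming register is ignored if it
    is listed as safe to miss, otherwise it is reported as fatal.
    Returns a tuple (ignored, fatal).
    """
    dest = set(dest_regs)
    safe = set(safe_missing_regs)
    ignored = [r for r in incoming_regs if r not in dest and r in safe]
    fatal = [r for r in incoming_regs if r not in dest and r not in safe]
    return ignored, fatal


if __name__ == "__main__":
    # Placeholder indexes 0x1 and 0x2 stand in for ordinary registers.
    incoming = [0x1, 0x2, DBGDTRTX_AA32]
    destination = visible_regs([0x1, 0x2, 0x3], hidden_regs=[0x3])
    ignored, fatal = check_incoming(incoming, destination, [DBGDTRTX_AA32])
    print("ignored:", [hex(r) for r in ignored])
    print("fatal:", [hex(r) for r in fatal])
```

In this model the migration would proceed only when the fatal list is empty; hiding a register changes what the destination exposes, while the safe-missing list changes what the destination tolerates in the incoming stream.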
I gave these a spin - works as advertised, no issues found.

Tested-by: Sebastian Ott <[email protected]>
