Hi Peter, Richard,

On 1/26/26 5:52 PM, Eric Auger wrote:
> When migrating ARM guests across same machines with different host
> kernels we are likely to encounter failures such as:
>
> "failed to load cpu:cpreg_vmstate_array_len"
>
> This is due to the fact KVM exposes a different number of registers
> to qemu on source and destination. When trying to migrate a bigger
> register set to a smaller one, qemu cannot save the CPU state.
>
> For example, we recently faced this kind of situation with:
> - unconditional exposure of KVM_REG_ARM_VENDOR_HYP_BMAP_2 FW pseudo
>   register from v6.16 onwards. Causes backward migration failure.
> - removal of unconditional exposure of TCR2_EL1, PIRE0_EL1, PIR_EL1
>   from v6.13 onwards. Causes forward migration failure.
>
> This situation is really problematic for distributions which want to
> guarantee forward and backward migration of a given machine type
> between different releases.
>
> While the series mainly targets KVM acceleration, this problem
> also exists with TCG. For instance some registers may be exposed
> while they shouldn't. Then it is tricky to fix that situation 
> without breaking forward migration. An example was provided by
> Peter: 4f2b82f60 ("target/arm: Reinstate bogus AArch32 DBGDTRTX
> register for migration compat").
>
> This series introduces 2 CPU array properties that list
> - the CPU registers to hide from the exposed sysregs (aims
>   at removing registers from the destination)
> - The CPU registers that may not exist but which can be found
>   in the incoming migration stream (aims at ignoring extra
>   registers in the incoming state)
>
> An example is given to illustrate how those props
> could be used to apply compats for machine types supposed to "see" the
> same register set across various host kernels.
>
> Mitigation of DBGDTRTX issue would be achieved by setting
> x-mig-safe-missing-regs=0x40200000200e0298 which matches
> AArch32 DBGDTRTX register index.
>
> The first patch improves the tracing so that we can quickly detect
> which registers do not match between the incoming stream and the
> exposed sysregs
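
For reference, the 0x40200000200e0298 index quoted above can be split into
its fields. This is only a minimal sketch: the bit layout is assumed from
QEMU's ENCODE_CP_REG() macro and the KVM register ID format for 32-bit
coprocessor registers, and the constant and function names are illustrative,
not taken from the series.

```python
# Assumed layout (illustrative names, not from the series):
#   bits 63:56 architecture, bits 55:52 log2(size in bytes),
#   bit 29 NS, bits 27:16 coprocessor number,
#   bits 14:11 CRn, bits 10:7 CRm, bits 6:3 opc1, bits 2:0 opc2.
KVM_REG_ARCH_MASK = 0xFF00000000000000
KVM_REG_ARM = 0x4000000000000000
KVM_REG_SIZE_MASK = 0x00F0000000000000
KVM_REG_SIZE_SHIFT = 52


def decode_aarch32_cpreg(kvm_id: int) -> dict:
    """Split a 32-bit ARM coprocessor register index into its fields."""
    if kvm_id & KVM_REG_ARCH_MASK != KVM_REG_ARM:
        raise ValueError("not a 32-bit ARM register index")
    size_log2 = (kvm_id & KVM_REG_SIZE_MASK) >> KVM_REG_SIZE_SHIFT
    return {
        "size_bytes": 1 << size_log2,
        "ns": (kvm_id >> 29) & 0x1,
        "cp": (kvm_id >> 16) & 0xFFF,
        "crn": (kvm_id >> 11) & 0xF,
        "crm": (kvm_id >> 7) & 0xF,
        "opc1": (kvm_id >> 3) & 0xF,
        "opc2": kvm_id & 0x7,
    }


fields = decode_aarch32_cpreg(0x40200000200E0298)
print(fields)
```

Under these assumptions the index decodes to a 4-byte cp14 register with
CRn=0, CRm=5, opc1=3, opc2=0 and the NS bit set.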

Most of the patches of the series have collected R-bs. Do you have
concerns with the approach?

This aims at solving real-life distro issues with cross-kernel migration
failures, and we would appreciate getting a generic solution within the
11.0 timeframe.
Also [PATCH v4 0/2] arm: add kvm-psci-version vcpu property 
(https://lore.kernel.org/all/[email protected]/)
is part of this initiative and also collected R-bs/T-bs.

Looking forward to your feedback.

Eric 
>
> ---
>
> Available at:
> https://github.com/eauger/qemu/tree/mitig-v6
>
> ---
>
> Tests:
> - migration with 10.2 machine with old qemu featuring DBGDTRTX
>   and this one where it is removed. Forward migration works.
>   backward doesn't because the register is not present in the
>   input migration stream and write_list_to_cpustate() fails
>   while calling write_raw_cp_reg() and reading it back. write_raw_cp_reg()
>   seems to read an uninitialized value from cpu->cpreg_values[i].
>   write has no effect since type is ARM_CP_CONST but read_raw_cp_reg
>   returns ri->resetvalue which differs from uninitialized value.
>   I would have expected the initial cpu->cpreg_values[i] to match
>   reset value which is obviously not the case. Also the comment hints
>   that it should be. So maybe another issue? Nevertheless I am
>   not totally sure supporting backward migration for TCG is a must.
>   This may be fixed separately if it is confirmed this is a bug.
>
> - migration with accel=kvm back and forth old host/qemu where
>   host does not feature fixes for TCR2_EL1, PIRE0_EL1, PIR_EL1
>   and recent KVM_REG_ARM_VENDOR_HYP_BMAP_2 FW and more recent
>   kernel/this qemu that feature them. Migration works forward
>   and backward with 10.1 machine type.
>
> History:
>
> v5 -> v6:
> - move GString init and collected Sebastian's R-b
>
> v4 -> v5:
> - Fixed issue reported by Sebastian about aggregated array
>   props. This led to the introduction of
>   hw/arm/virt: Introduce framework to aggregate hidden-regs
>   and safe-missing-regs
> - Collected additional hacks from Connie
>
> v3 -> v4:
> - Collected Connie's & Sebastian's R-bs
> - Squashed patches 3 and 5
> - various typos and rewording
>
> v2 -> v3:
> - revert target/arm: Reinstate bogus AArch32 DBGDTRTX register for
>   migration compat
> - fix some typos and rework target/arm/cpu.h hidden_regs comment (Connie)
> - Even for TCG we use KVM index
>
> v1 -> v2:
> - fixed typos (Connie)
> - Make it less KVM specific (tentative hiding of TCG regs, not
>   tested)
> - Tested DBGDTRTX TCG case reported by Peter
> - No change to the property format yet. Ran out of ideas. However
>   I changed the name of the property with x-mig prefix
> - Changed the terminology, kept hiding but removed fake which was
>   confusing
> - Simplified the logic for regs missing in the incoming stream and
>   do not check anymore they are exposed on dest
>
>
> Eric Auger (11):
>   hw/arm/virt: Rename arm_virt_compat into arm_virt_compat_defaults
>   target/arm/machine: Improve traces on register mismatch during
>     migration
>   target/arm/cpu: Allow registers to be hidden
>   target/arm/machine: Allow extra regs in the incoming stream
>   kvm-all: Enforce hidden regs are never accessed
>   target/arm/cpu: Implement hide_reg callback()
>   target/arm/cpu: Expose x-mig-hidden-regs and x-mig-safe-missing-regs
>     properties
>   hw/arm/virt: Declare AArch32 DBGDTRTX as safe to ignore in incoming
>     stream
>   Revert "target/arm: Reinstate bogus AArch32 DBGDTRTX register for
>     migration compat"
>   hw/arm/virt: Introduce framework to aggregate hidden-regs and
>     safe-missing-regs
>   hw/arm/virt: [DO NOT UPSTREAM] Enforce compatibility with older
>     kernels
>
>  include/hw/arm/virt.h     | 23 ++++++++++
>  include/hw/core/cpu.h     |  2 +
>  target/arm/cpu.h          | 48 +++++++++++++++++++++
>  accel/kvm/kvm-all.c       | 12 ++++++
>  hw/arm/virt.c             | 89 ++++++++++++++++++++++++++++++++++++---
>  target/arm/cpu.c          | 11 +++++
>  target/arm/debug_helper.c | 29 -------------
>  target/arm/helper.c       | 12 +++++-
>  target/arm/kvm.c          | 35 ++++++++++++++-
>  target/arm/machine.c      | 70 +++++++++++++++++++++++++++---
>  target/arm/trace-events   | 10 +++++
>  11 files changed, 298 insertions(+), 43 deletions(-)
>

