Fabiano Rosas <faro...@suse.de> writes: > Peter Maydell <peter.mayd...@linaro.org> writes: > >> On Wed, 23 Jul 2025 at 23:20, Fabiano Rosas <faro...@suse.de> wrote: >>> >>> Peter Maydell <peter.mayd...@linaro.org> writes: >>> >>> > We don't implement the Debug Communications Channel (DCC), but >>> > we do attempt to provide dummy versions of its system registers >>> > so that software that tries to access them doesn't fall over. >>> > >>> > However, we got the tx/rx register definitions wrong. These >>> > should be: >>> > >>> > AArch32: >>> > DBGDTRTX p14 0 c0 c5 0 (on writes) >>> > DBGDTRRX p14 0 c0 c5 0 (on reads) >>> > >>> > AArch64: >>> > DBGDTRTX_EL0 2 3 0 5 0 (on writes) >>> > DBGDTRRX_EL0 2 3 0 5 0 (on reads) >>> > DBGDTR_EL0 2 3 0 4 0 (reads and writes) >>> > >>> > where DBGDTRTX and DBGDTRRX are effectively different names for the >>> > same 32-bit register, which has tx behaviour on writes and rx >>> > behaviour on reads. The AArch64-only DBGDTR_EL0 is a 64-bit wide >>> > register whose top and bottom halves map to the DBGDTRRX and DBGDTRTX >>> > registers. >>> > >>> > Currently we have just one cpreg struct, which: >>> > * calls itself DBGDTR_EL0 >>> > * uses the DBGDTRTX_EL0/DBGDTRRX_EL0 encoding >>> > * is marked as ARM_CP_STATE_BOTH but has the wrong opc1 >>> > value for AArch32 >>> > * is implemented as RAZ/WI >>> > >>> > Correct the encoding so: >>> > * we name the DBGDTRTX/DBGDTRRX register correctly >>> > * we split it into AA64 and AA32 versions so we can get the >>> > AA32 encoding right >>> > * we implement DBGDTR_EL0 at its correct encoding >>> > >>> > Cc: qemu-sta...@nongnu.org >>> > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2986 >>> > Signed-off-by: Peter Maydell <peter.mayd...@linaro.org> >>> > Reviewed-by: Richard Henderson <richard.hender...@linaro.org> >>> > Message-id: 20250708141049.778361-1-peter.mayd...@linaro.org >>> > --- >>> > target/arm/debug_helper.c | 13 +++++++++++-- >>> > 1 file changed, 11 insertions(+), 2 deletions(-) >>> > >>> > diff --git a/target/arm/debug_helper.c b/target/arm/debug_helper.c >>> > index 69fb1d0d9ff..aee06d4d426 100644 >>> > --- a/target/arm/debug_helper.c >>> > +++ b/target/arm/debug_helper.c >>> > @@ -988,11 +988,20 @@ static const ARMCPRegInfo debug_cp_reginfo[] = { >>> > .opc0 = 2, .opc1 = 0, .crn = 0, .crm = 3, .opc2 = 2, >>> > .access = PL1_RW, .accessfn = access_tdcc, >>> > .type = ARM_CP_CONST, .resetvalue = 0 }, >>> > - /* DBGDTRTX_EL0/DBGDTRRX_EL0 depend on direction */ >>> > - { .name = "DBGDTR_EL0", .state = ARM_CP_STATE_BOTH, .cp = 14, >>> > + /* Architecturally DBGDTRTX is named DBGDTRRX when used for reads */ >>> > + { .name = "DBGDTRTX_EL0", .state = ARM_CP_STATE_AA64, >>> > .opc0 = 2, .opc1 = 3, .crn = 0, .crm = 5, .opc2 = 0, >>> > .access = PL0_RW, .accessfn = access_tdcc, >>> > .type = ARM_CP_CONST, .resetvalue = 0 }, >>> > + { .name = "DBGDTRTX", .state = ARM_CP_STATE_AA32, .cp = 14, >>> > + .opc1 = 0, .crn = 0, .crm = 5, .opc2 = 0, >>> > + .access = PL0_RW, .accessfn = access_tdcc, >>> > + .type = ARM_CP_CONST, .resetvalue = 0 }, >>> > + /* This is AArch64-only and is a combination of DBGDTRTX and >>> > DBGDTRRX */ >>> > + { .name = "DBGDTR_EL0", .state = ARM_CP_STATE_AA64, >>> > + .opc0 = 2, .opc1 = 3, .crn = 0, .crm = 4, .opc2 = 0, >>> > + .access = PL0_RW, .accessfn = access_tdcc, >>> > + .type = ARM_CP_CONST, .resetvalue = 0 }, >>> > /* >>> > * OSECCR_EL1 provides a mechanism for an operating system >>> > * to access the contents of EDECCR. EDECCR is not implemented >>> > though, >>> >>> Hi, this patch breaks migration. I'm leaving for the day and will take a >>> closer look in the morning. But since we have timezones, here it is: >> >> Thanks for the report; I can repro this. It happens because >> the loop in cpu_post_load hits the "register in their list but >> not ours" check, because the source VM has the AArch32 >> {.cp = 14, .opc0 = 2, .opc1 = 3, .crn = 0, .crm = 5, .opc2 = 0} >> register which should never have existed. >> > > My debugging (in dst) shows: > > (gdb) x/16x 0x555557ad5d20 //cpu->cpreg_vmstate_indexes[0x18 to 0x1c] > 0x555557ad5d20: 0x200e0205 0x40200000 0x200e0284< 0x40200000 > 0x555557ad5d30: 0x200e0285 0x40200000 0x200e0298 0x40200000 > 0x555557ad5d40: 0x200e0302 0x40200000 0x200e0380 0x40200000 > 0x555557ad5d50: 0x200e0800 0x40200000 0x200e0838 0x40200000 > > (gdb) x/16x 0x555557ad4ac0 //cpu->cpreg_indexes[0x18 to 0x1c] > p0x555557ad4ac0: 0x200e0205 0x40200000 0x200e0280< 0x40200000 > 0x555557ad4ad0: 0x200e0284 0x40200000 0x200e0285 0x40200000 > 0x555557ad4ae0: 0x200e0302 0x40200000 0x200e0380 0x40200000 > 0x555557ad4af0: 0x200e0800 0x40200000 0x200e0838 0x40200000 > > >> I'm not sure how to handle this, as we have no mechanism for >> "ignore this incoming register value, it is bogus". I'm surprised >> we've never run into this before... >> > > I was thinking the same. > > I actually don't understand what the encoding in cpu->cpreg_indexes is > supposed to represent. How does comparing the indexes implies "in our > list"/"in their list"? Is there some sort of ISA level assumption? > > if (cpu->cpreg_vmstate_indexes[v] > cpu->cpreg_indexes[i]) { > /* register in our list but not incoming : skip it */ > continue; > } > if (cpu->cpreg_vmstate_indexes[v] < cpu->cpreg_indexes[i]) { > /* register in their list but not ours: fail migration */ > return -1; > } >
Ok, I spotted the sorting now.