Fabiano Rosas <faro...@suse.de> writes:

> Peter Maydell <peter.mayd...@linaro.org> writes:
>
>> On Wed, 23 Jul 2025 at 23:20, Fabiano Rosas <faro...@suse.de> wrote:
>>>
>>> Peter Maydell <peter.mayd...@linaro.org> writes:
>>>
>>> > We don't implement the Debug Communications Channel (DCC), but
>>> > we do attempt to provide dummy versions of its system registers
>>> > so that software that tries to access them doesn't fall over.
>>> >
>>> > However, we got the tx/rx register definitions wrong. These
>>> > should be:
>>> >
>>> > AArch32:
>>> >   DBGDTRTX   p14 0 c0 c5 0  (on writes)
>>> >   DBGDTRRX   p14 0 c0 c5 0  (on reads)
>>> >
>>> > AArch64:
>>> >   DBGDTRTX_EL0  2 3 0 5 0 (on writes)
>>> >   DBGDTRRX_EL0  2 3 0 5 0 (on reads)
>>> >   DBGDTR_EL0    2 3 0 4 0 (reads and writes)
>>> >
>>> > where DBGDTRTX and DBGDTRRX are effectively different names for the
>>> > same 32-bit register, which has tx behaviour on writes and rx
>>> > behaviour on reads.  The AArch64-only DBGDTR_EL0 is a 64-bit wide
>>> > register whose top and bottom halves map to the DBGDTRRX and DBGDTRTX
>>> > registers.
>>> >
>>> > Currently we have just one cpreg struct, which:
>>> >  * calls itself DBGDTR_EL0
>>> >  * uses the DBGDTRTX_EL0/DBGDTRRX_EL0 encoding
>>> >  * is marked as ARM_CP_STATE_BOTH but has the wrong opc1
>>> >    value for AArch32
>>> >  * is implemented as RAZ/WI
>>> >
>>> > Correct the encoding so:
>>> >  * we name the DBGDTRTX/DBGDTRRX register correctly
>>> >  * we split it into AA64 and AA32 versions so we can get the
>>> >    AA32 encoding right
>>> >  * we implement DBGDTR_EL0 at its correct encoding
>>> >
>>> > Cc: qemu-sta...@nongnu.org
>>> > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2986
>>> > Signed-off-by: Peter Maydell <peter.mayd...@linaro.org>
>>> > Reviewed-by: Richard Henderson <richard.hender...@linaro.org>
>>> > Message-id: 20250708141049.778361-1-peter.mayd...@linaro.org
>>> > ---
>>> >  target/arm/debug_helper.c | 13 +++++++++++--
>>> >  1 file changed, 11 insertions(+), 2 deletions(-)
>>> >
>>> > diff --git a/target/arm/debug_helper.c b/target/arm/debug_helper.c
>>> > index 69fb1d0d9ff..aee06d4d426 100644
>>> > --- a/target/arm/debug_helper.c
>>> > +++ b/target/arm/debug_helper.c
>>> > @@ -988,11 +988,20 @@ static const ARMCPRegInfo debug_cp_reginfo[] = {
>>> >        .opc0 = 2, .opc1 = 0, .crn = 0, .crm = 3, .opc2 = 2,
>>> >        .access = PL1_RW, .accessfn = access_tdcc,
>>> >        .type = ARM_CP_CONST, .resetvalue = 0 },
>>> > -    /* DBGDTRTX_EL0/DBGDTRRX_EL0 depend on direction */
>>> > -    { .name = "DBGDTR_EL0", .state = ARM_CP_STATE_BOTH, .cp = 14,
>>> > +    /* Architecturally DBGDTRTX is named DBGDTRRX when used for reads */
>>> > +    { .name = "DBGDTRTX_EL0", .state = ARM_CP_STATE_AA64,
>>> >        .opc0 = 2, .opc1 = 3, .crn = 0, .crm = 5, .opc2 = 0,
>>> >        .access = PL0_RW, .accessfn = access_tdcc,
>>> >        .type = ARM_CP_CONST, .resetvalue = 0 },
>>> > +    { .name = "DBGDTRTX", .state = ARM_CP_STATE_AA32, .cp = 14,
>>> > +      .opc1 = 0, .crn = 0, .crm = 5, .opc2 = 0,
>>> > +      .access = PL0_RW, .accessfn = access_tdcc,
>>> > +      .type = ARM_CP_CONST, .resetvalue = 0 },
>>> > +    /* This is AArch64-only and is a combination of DBGDTRTX and 
>>> > DBGDTRRX */
>>> > +    { .name = "DBGDTR_EL0", .state = ARM_CP_STATE_AA64,
>>> > +      .opc0 = 2, .opc1 = 3, .crn = 0, .crm = 4, .opc2 = 0,
>>> > +      .access = PL0_RW, .accessfn = access_tdcc,
>>> > +      .type = ARM_CP_CONST, .resetvalue = 0 },
>>> >      /*
>>> >       * OSECCR_EL1 provides a mechanism for an operating system
>>> >       * to access the contents of EDECCR. EDECCR is not implemented 
>>> > though,
>>>
>>> Hi, this patch breaks migration. I'm leaving for the day and will take a
>>> closer look in the morning. But since we have timezones, here it is:
>>
>> Thanks for the report; I can repro this. It happens because
>> the loop in cpu_post_load hits the "register in their list but
>> not ours" check, because the source VM has the AArch32
>> {.cp = 14, .opc0 = 2, .opc1 = 3, .crn = 0, .crm = 5, .opc2 = 0}
>> register which should never have existed.
>>
>
> My debugging (in dst) shows:
>
> (gdb) x/16x 0x555557ad5d20 //cpu->cpreg_vmstate_indexes[0x18 to 0x1c]
> 0x555557ad5d20: 0x200e0205      0x40200000      0x200e0284<     0x40200000
> 0x555557ad5d30: 0x200e0285      0x40200000      0x200e0298      0x40200000
> 0x555557ad5d40: 0x200e0302      0x40200000      0x200e0380      0x40200000
> 0x555557ad5d50: 0x200e0800      0x40200000      0x200e0838      0x40200000
>
> (gdb) x/16x 0x555557ad4ac0 //cpu->cpreg_indexes[0x18 to 0x1c]
> p0x555557ad4ac0: 0x200e0205      0x40200000      0x200e0280<     0x40200000
> 0x555557ad4ad0: 0x200e0284      0x40200000      0x200e0285      0x40200000
> 0x555557ad4ae0: 0x200e0302      0x40200000      0x200e0380      0x40200000
> 0x555557ad4af0: 0x200e0800      0x40200000      0x200e0838      0x40200000
>
>
>> I'm not sure how to handle this, as we have no mechanism for
>> "ignore this incoming register value, it is bogus". I'm surprised
>> we've never run into this before...
>>
>
> I was thinking the same.
>
> I actually don't understand what the encoding in cpu->cpreg_indexes is
> supposed to represent. How does comparing the indexes implies "in our
> list"/"in their list"? Is there some sort of ISA level assumption?
>
>        if (cpu->cpreg_vmstate_indexes[v] > cpu->cpreg_indexes[i]) {
>             /* register in our list but not incoming : skip it */
>             continue;
>         }
>         if (cpu->cpreg_vmstate_indexes[v] < cpu->cpreg_indexes[i]) {
>             /* register in their list but not ours: fail migration */
>             return -1;
>         }
>  

Ok, I spotted the sorting now. 

Reply via email to