On Wed, 23 Jul 2025 at 23:20, Fabiano Rosas <faro...@suse.de> wrote:
>
> Peter Maydell <peter.mayd...@linaro.org> writes:
>
> > We don't implement the Debug Communications Channel (DCC), but
> > we do attempt to provide dummy versions of its system registers
> > so that software that tries to access them doesn't fall over.
> >
> > However, we got the tx/rx register definitions wrong. These
> > should be:
> >
> > AArch32:
> >   DBGDTRTX   p14 0 c0 c5 0  (on writes)
> >   DBGDTRRX   p14 0 c0 c5 0  (on reads)
> >
> > AArch64:
> >   DBGDTRTX_EL0  2 3 0 5 0 (on writes)
> >   DBGDTRRX_EL0  2 3 0 5 0 (on reads)
> >   DBGDTR_EL0    2 3 0 4 0 (reads and writes)
> >
> > where DBGDTRTX and DBGDTRRX are effectively different names for the
> > same 32-bit register, which has tx behaviour on writes and rx
> > behaviour on reads.  The AArch64-only DBGDTR_EL0 is a 64-bit wide
> > register whose top and bottom halves map to the DBGDTRRX and DBGDTRTX
> > registers.
> >
> > Currently we have just one cpreg struct, which:
> >  * calls itself DBGDTR_EL0
> >  * uses the DBGDTRTX_EL0/DBGDTRRX_EL0 encoding
> >  * is marked as ARM_CP_STATE_BOTH but has the wrong opc1
> >    value for AArch32
> >  * is implemented as RAZ/WI
> >
> > Correct the encoding so:
> >  * we name the DBGDTRTX/DBGDTRRX register correctly
> >  * we split it into AA64 and AA32 versions so we can get the
> >    AA32 encoding right
> >  * we implement DBGDTR_EL0 at its correct encoding
> >
> > Cc: qemu-sta...@nongnu.org
> > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2986
> > Signed-off-by: Peter Maydell <peter.mayd...@linaro.org>
> > Reviewed-by: Richard Henderson <richard.hender...@linaro.org>
> > Message-id: 20250708141049.778361-1-peter.mayd...@linaro.org
> > ---
> >  target/arm/debug_helper.c | 13 +++++++++++--
> >  1 file changed, 11 insertions(+), 2 deletions(-)
> >
> > diff --git a/target/arm/debug_helper.c b/target/arm/debug_helper.c
> > index 69fb1d0d9ff..aee06d4d426 100644
> > --- a/target/arm/debug_helper.c
> > +++ b/target/arm/debug_helper.c
> > @@ -988,11 +988,20 @@ static const ARMCPRegInfo debug_cp_reginfo[] = {
> >        .opc0 = 2, .opc1 = 0, .crn = 0, .crm = 3, .opc2 = 2,
> >        .access = PL1_RW, .accessfn = access_tdcc,
> >        .type = ARM_CP_CONST, .resetvalue = 0 },
> > -    /* DBGDTRTX_EL0/DBGDTRRX_EL0 depend on direction */
> > -    { .name = "DBGDTR_EL0", .state = ARM_CP_STATE_BOTH, .cp = 14,
> > +    /* Architecturally DBGDTRTX is named DBGDTRRX when used for reads */
> > +    { .name = "DBGDTRTX_EL0", .state = ARM_CP_STATE_AA64,
> >        .opc0 = 2, .opc1 = 3, .crn = 0, .crm = 5, .opc2 = 0,
> >        .access = PL0_RW, .accessfn = access_tdcc,
> >        .type = ARM_CP_CONST, .resetvalue = 0 },
> > +    { .name = "DBGDTRTX", .state = ARM_CP_STATE_AA32, .cp = 14,
> > +      .opc1 = 0, .crn = 0, .crm = 5, .opc2 = 0,
> > +      .access = PL0_RW, .accessfn = access_tdcc,
> > +      .type = ARM_CP_CONST, .resetvalue = 0 },
> > +    /* This is AArch64-only and is a combination of DBGDTRTX and DBGDTRRX 
> > */
> > +    { .name = "DBGDTR_EL0", .state = ARM_CP_STATE_AA64,
> > +      .opc0 = 2, .opc1 = 3, .crn = 0, .crm = 4, .opc2 = 0,
> > +      .access = PL0_RW, .accessfn = access_tdcc,
> > +      .type = ARM_CP_CONST, .resetvalue = 0 },
> >      /*
> >       * OSECCR_EL1 provides a mechanism for an operating system
> >       * to access the contents of EDECCR. EDECCR is not implemented though,
>
> Hi, this patch breaks migration. I'm leaving for the day and will take a
> closer look in the morning. But since we have timezones, here it is:

Thanks for the report; I can repro this. It happens because
the loop in cpu_post_load hits the "register in their list but
not ours" check, because the source VM has the AArch32
{.cp = 14, .opc0 = 2, .opc1 = 3, .crn = 0, .crm = 5, .opc2 = 0}
register which should never have existed.

I'm not sure how to handle this, as we have no mechanism for
"ignore this incoming register value, it is bogus". I'm surprised
we've never run into this before...

I won't be able to look at this further til the tail end of
next week.

As an aside, it is a shame that post_load hooks do not
take an Error** -- if they did we would be able to report
this more usefully to the user by saying why the migration
failed instead of just returning -1. Perhaps it would be
worth adding _err versions of the hook fields in
VMStateDescription so that devices can optionally
implement them instead if they have interesting or
complicated errors to report ?

thanks
-- PMM

Reply via email to