On 20/01/2022 09:14, Christophe Lyon wrote:


On Wed, Jan 19, 2022 at 7:18 PM Andre Vieira (lists) via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:

    Hi Christophe,

    On 13/01/2022 14:56, Christophe Lyon via Gcc-patches wrote:
    > At some point during the development of this patch series, it
    appeared
    > that in some cases the register allocator wants “VPR or general”
    > rather than “VPR or general or FP” (which is the same thing as
    > ALL_REGS).  The series does not seem to require this anymore, but it
    > seems to be a good thing to do anyway, to give the register
    allocator
    > more freedom.
    Not sure I fully understand this, but I guess it creates an extra
    class
    the register allocator can use to group things that can go into
    VPR or
    general reg?
    >
    > CLASS_MAX_NREGS and arm_hard_regno_nregs need adjustment to avoid a
    > regression in gcc.dg/stack-usage-1.c when compiled with -mthumb
    > -mfloat-abi=hard -march=armv8.1-m.main+mve.fp+fp.dp.
    I have not looked into this failure, but ...
    >
    > 2022-01-13  Christophe Lyon  <christophe.l...@foss.st.com>
    >
    >       gcc/
    >       * config/arm/arm.h (reg_class): Add GENERAL_AND_VPR_REGS.
    >       (REG_CLASS_NAMES): Likewise.
    >       (REG_CLASS_CONTENTS): Likewise.
    >       (CLASS_MAX_NREGS): Handle VPR.
    >       * config/arm/arm.c (arm_hard_regno_nregs): Handle VPR.
    >
    > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
    > index bb75921f32d..c3559ca8703 100644
    > --- a/gcc/config/arm/arm.c
    > +++ b/gcc/config/arm/arm.c
    > @@ -25287,6 +25287,9 @@ thumb2_asm_output_opcode (FILE * stream)
    >   static unsigned int
    >   arm_hard_regno_nregs (unsigned int regno, machine_mode mode)
    >   {
    > +  if (IS_VPR_REGNUM (regno))
    > +    return CEIL (GET_MODE_SIZE (mode), 2);
    When do we ever want to use more than 1 register for VPR?


That was tricky.
Richard Sandiford helped me analyze the problem, I guess I can quote him:

RS> I think the problem is a combination of a few things:
RS>
RS> (1) arm_hard_regno_mode_ok rejects SImode in VPR, so SImode moves
RS>     to or from the VPR_REG class get the maximum cost.
RS>
RS> (2) IRA thinks from CLASS_MAX_NREGS and arm_hard_regno_nregs that
RS>    VPR is big enough to hold SImode.
RS>
RS> (3) If a class C1 is a superset of a class C2, and if C2 is big enough
RS>     to hold a mode M, IRA ensures that move costs for M involving C1
RS>     are >= move costs for M involving C2.
RS>
RS> (1) is correct but (2) isn't.  IMO (3) is dubious: the trigger should
RS> be whether C2 is actually allowed to hold M, not whether C2 is big enough RS> to hold M.  However, changing that is likely to cause problems elsewhere,
RS> and could lead to classes like GENERAL_AND_FP_REGS being used when
RS> FP_REGS are disabled (which might be confusing).
RS>

I understand everything up until here.

RS> “Fixing” (2) using:
RS>
RS>  CEIL (GET_MODE_SIZE (mode), 2)
I was wondering why not just return '1' for VPR_REGNUM, rather than use the fact that the mode-size we use for VPR is 2 bytes, so diving it by 2 makes 1. Unless we ever decide to use a larger mode for VPR, maybe that's what this is trying to address? I can't imagine we would ever need to though since for MVE there is only one VPR register and it is always 16-bits. Just feels overly complicated to me.
RS>
RS> for VPR_REG & VPR_REGNUM seems to make the costs correct.  I don't know
RS> if it would cause other problems though.
RS>
RS> I don't think CLASS_MAX_NREGS should do anything special for superclasses
RS> of VPR_REG, even though that makes the definition non-obvious.  If an
RS> SImode is stored in GENERAL_AND_VPR_REGS, it will in reality be stored
RS> in the GENERAL_REGS subset, so the maximum count should come from there
RS> rather than VPR_REG.

Does that answer your question?
I guess it end's up being correct, just don't understand the complexity that's all.

Reply via email to