On 20/01/2022 09:14, Christophe Lyon wrote:
On Wed, Jan 19, 2022 at 7:18 PM Andre Vieira (lists) via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
Hi Christophe,
On 13/01/2022 14:56, Christophe Lyon via Gcc-patches wrote:
> At some point during the development of this patch series, it
appeared
> that in some cases the register allocator wants “VPR or general”
> rather than “VPR or general or FP” (which is the same thing as
> ALL_REGS). The series does not seem to require this anymore, but it
> seems to be a good thing to do anyway, to give the register
allocator
> more freedom.
Not sure I fully understand this, but I guess it creates an extra
class
the register allocator can use to group things that can go into
VPR or
general reg?
>
> CLASS_MAX_NREGS and arm_hard_regno_nregs need adjustment to avoid a
> regression in gcc.dg/stack-usage-1.c when compiled with -mthumb
> -mfloat-abi=hard -march=armv8.1-m.main+mve.fp+fp.dp.
I have not looked into this failure, but ...
>
> 2022-01-13 Christophe Lyon <christophe.l...@foss.st.com>
>
> gcc/
> * config/arm/arm.h (reg_class): Add GENERAL_AND_VPR_REGS.
> (REG_CLASS_NAMES): Likewise.
> (REG_CLASS_CONTENTS): Likewise.
> (CLASS_MAX_NREGS): Handle VPR.
> * config/arm/arm.c (arm_hard_regno_nregs): Handle VPR.
>
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index bb75921f32d..c3559ca8703 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -25287,6 +25287,9 @@ thumb2_asm_output_opcode (FILE * stream)
> static unsigned int
> arm_hard_regno_nregs (unsigned int regno, machine_mode mode)
> {
> + if (IS_VPR_REGNUM (regno))
> + return CEIL (GET_MODE_SIZE (mode), 2);
When do we ever want to use more than 1 register for VPR?
That was tricky.
Richard Sandiford helped me analyze the problem, I guess I can quote him:
RS> I think the problem is a combination of a few things:
RS>
RS> (1) arm_hard_regno_mode_ok rejects SImode in VPR, so SImode moves
RS> to or from the VPR_REG class get the maximum cost.
RS>
RS> (2) IRA thinks from CLASS_MAX_NREGS and arm_hard_regno_nregs that
RS> VPR is big enough to hold SImode.
RS>
RS> (3) If a class C1 is a superset of a class C2, and if C2 is big enough
RS> to hold a mode M, IRA ensures that move costs for M involving C1
RS> are >= move costs for M involving C2.
RS>
RS> (1) is correct but (2) isn't. IMO (3) is dubious: the trigger should
RS> be whether C2 is actually allowed to hold M, not whether C2 is big
enough
RS> to hold M. However, changing that is likely to cause problems
elsewhere,
RS> and could lead to classes like GENERAL_AND_FP_REGS being used when
RS> FP_REGS are disabled (which might be confusing).
RS>
I understand everything up until here.
RS> “Fixing” (2) using:
RS>
RS> CEIL (GET_MODE_SIZE (mode), 2)
I was wondering why not just return '1' for VPR_REGNUM, rather than use
the fact that the mode-size we use for VPR is 2 bytes, so diving it by 2
makes 1. Unless we ever decide to use a larger mode for VPR, maybe
that's what this is trying to address? I can't imagine we would ever
need to though since for MVE there is only one VPR register and it is
always 16-bits. Just feels overly complicated to me.
RS>
RS> for VPR_REG & VPR_REGNUM seems to make the costs correct. I don't
know
RS> if it would cause other problems though.
RS>
RS> I don't think CLASS_MAX_NREGS should do anything special for
superclasses
RS> of VPR_REG, even though that makes the definition non-obvious. If an
RS> SImode is stored in GENERAL_AND_VPR_REGS, it will in reality be stored
RS> in the GENERAL_REGS subset, so the maximum count should come from
there
RS> rather than VPR_REG.
Does that answer your question?
I guess it end's up being correct, just don't understand the complexity
that's all.