On 20/01/2022 10:40, Richard Sandiford wrote:
"Andre Vieira (lists)" <andre.simoesdiasvie...@arm.com> writes:
On 20/01/2022 09:14, Christophe Lyon wrote:
On Wed, Jan 19, 2022 at 7:18 PM Andre Vieira (lists) via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
Hi Christophe,
On 13/01/2022 14:56, Christophe Lyon via Gcc-patches wrote:
> At some point during the development of this patch series, it
appeared
> that in some cases the register allocator wants “VPR or general”
> rather than “VPR or general or FP” (which is the same thing as
> ALL_REGS). The series does not seem to require this anymore, but it
> seems to be a good thing to do anyway, to give the register
allocator
> more freedom.
Not sure I fully understand this, but I guess it creates an extra
class
the register allocator can use to group things that can go into
VPR or
general reg?
>
> CLASS_MAX_NREGS and arm_hard_regno_nregs need adjustment to avoid a
> regression in gcc.dg/stack-usage-1.c when compiled with -mthumb
> -mfloat-abi=hard -march=armv8.1-m.main+mve.fp+fp.dp.
I have not looked into this failure, but ...
>
> 2022-01-13 Christophe Lyon <christophe.l...@foss.st.com>
>
> gcc/
> * config/arm/arm.h (reg_class): Add GENERAL_AND_VPR_REGS.
> (REG_CLASS_NAMES): Likewise.
> (REG_CLASS_CONTENTS): Likewise.
> (CLASS_MAX_NREGS): Handle VPR.
> * config/arm/arm.c (arm_hard_regno_nregs): Handle VPR.
>
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index bb75921f32d..c3559ca8703 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -25287,6 +25287,9 @@ thumb2_asm_output_opcode (FILE * stream)
> static unsigned int
> arm_hard_regno_nregs (unsigned int regno, machine_mode mode)
> {
> + if (IS_VPR_REGNUM (regno))
> + return CEIL (GET_MODE_SIZE (mode), 2);
When do we ever want to use more than 1 register for VPR?
That was tricky.
Richard Sandiford helped me analyze the problem, I guess I can quote him:
RS> I think the problem is a combination of a few things:
RS>
RS> (1) arm_hard_regno_mode_ok rejects SImode in VPR, so SImode moves
RS> to or from the VPR_REG class get the maximum cost.
RS>
RS> (2) IRA thinks from CLASS_MAX_NREGS and arm_hard_regno_nregs that
RS> VPR is big enough to hold SImode.
RS>
RS> (3) If a class C1 is a superset of a class C2, and if C2 is big enough
RS> to hold a mode M, IRA ensures that move costs for M involving C1
RS> are >= move costs for M involving C2.
RS>
RS> (1) is correct but (2) isn't. IMO (3) is dubious: the trigger should
RS> be whether C2 is actually allowed to hold M, not whether C2 is big
enough
RS> to hold M. However, changing that is likely to cause problems
elsewhere,
RS> and could lead to classes like GENERAL_AND_FP_REGS being used when
RS> FP_REGS are disabled (which might be confusing).
RS>
I understand everything up until here.
RS> “Fixing” (2) using:
RS>
RS> CEIL (GET_MODE_SIZE (mode), 2)
I was wondering why not just return '1' for VPR_REGNUM, rather than use
the fact that the mode-size we use for VPR is 2 bytes, so diving it by 2
makes 1. Unless we ever decide to use a larger mode for VPR, maybe
that's what this is trying to address? I can't imagine we would ever
need to though since for MVE there is only one VPR register and it is
always 16-bits. Just feels overly complicated to me.
For context, that's what the first version did, and is what led to
the reload failure. The above is trying to explain why returning
1 doesn't work in practice.
To put (2) a slightly different way: if the port says VPR_REGNUM takes
1 register regardless of the mode passed in, the port is effectively
saying that VPR (and thus VPR_REGNUM) has enough bits to hold *any* mode
passed in (SImode, DImode, etc.). It actually makes VPR seem bigger
than a general register.
In the particular case of the reload failure, returning 1 effectively
tells the RA that VPR is big enough to hold SImode, but that the port is
nevertheless choosing not to allow VPR to be used to hold SImode. This
then “infects” the SImode cost of GENERAL_AND_VPR_REGS.
Thanks,
Richard
Ah OK thanks for the explanation.