> On 10 Jul 2025, at 11:12, Kyrylo Tkachov <ktkac...@nvidia.com> wrote:
> 
> 
> 
>> On 10 Jul 2025, at 10:40, Richard Sandiford <richard.sandif...@arm.com> 
>> wrote:
>> 
>> Kyrylo Tkachov <ktkac...@nvidia.com> writes:
>>> Hi all,
>>> 
>>> While the SVE2 NBSL instruction accepts MOVPRFX to add more flexibility
>>> due to its tied operands, the destination of the movprfx cannot be also
>>> a source operand. But the offending pattern in aarch64-sve2.md tries
>>> to do exactly that for the "=?&w,w,w" alternative and gas warns for the
>>> attached testcase.
>>> 
>>> This patch just removes that alternative causing RA to emit a normal extra
>>> move.
>>> So for the testcase in the patch we now generate:
>>> nor_z:
>>> nbsl z1.d, z1.d, z2.d, z1.d
>>> mov z0.d, z1.d
>>> ret
>>> 
>>> instead of the previous:
>>> nor_z:
>>> movprfx z0, z1
>>> nbsl z0.d, z0.d, z2.d, z0.d
>>> ret
>>> 
>>> which generated a gas warning.
>> 
>> Shouldn't we instead change it to:
>> 
>>    [ ?&w      , w  , w ; yes            ] movprfx\t%0, %1\;nbsl\t%0.d, %0.d, 
>> %2.d, %1.d
>> 
>> ?  The "&" ensures that %1 is still valid in the NBSL.
>> 
>> (That's OK if it works.)
> 
> Yes, that seems to work, thanks.
> I’ll push this version after some more testing.
> 

Shall I backport this for GCC 15.2 as well?
The test case uses C operators which were enabled in GCC 15, though I suppose 
one could construct a pure ACLE intrinsics testcase too.

Thanks,
Kyrill 

> Kyrill
> 
>> 
>> Thanks,
>> Richard
>> 
>>> Bootstrapped and tested on aarch64-none-linux-gnu.
>>> Ok for trunk?
>>> Do we want to backport it?
>>> 
>>> Thanks,
>>> Kyrill
>>> 
>>> 
>>> Signed-off-by: Kyrylo Tkachov <ktkac...@nvidia.com>
>>> 
>>> gcc/
>>> 
>>> PR target/120999
>>> * config/aarch64/aarch64-sve2.md (*aarch64_sve2_nor<mode>):
>>> Remove movprfx alternative.
>>> 
>>> gcc/testsuite/
>>> 
>>> PR target/120999
>>> * gcc.target/aarch64/sve2/pr120999.c: New test.
>>> 
>>> From bd24ce298461ee8129befda1983acf1b37a7215a Mon Sep 17 00:00:00 2001
>>> From: Kyrylo Tkachov <ktkac...@nvidia.com>
>>> Date: Wed, 9 Jul 2025 10:04:01 -0700
>>> Subject: [PATCH] aarch64: PR target/120999: Avoid movprfx for NBSL
>>> implementation of NOR
>>> 
>>> While the SVE2 NBSL instruction accepts MOVPRFX to add more flexibility
>>> due to its tied operands, the destination of the movprfx cannot be also
>>> a source operand.  But the offending pattern in aarch64-sve2.md tries
>>> to do exactly that for the "=?&w,w,w" alternative and gas warns for the
>>> attached testcase.
>>> 
>>> This patch just removes that alternative causing RA to emit a normal extra
>>> move.
>>> So for the testcase in the patch we now generate:
>>> nor_z:
>>>       nbsl    z1.d, z1.d, z2.d, z1.d
>>>       mov     z0.d, z1.d
>>>       ret
>>> 
>>> instead of the previous:
>>> nor_z:
>>>       movprfx z0, z1
>>>       nbsl    z0.d, z0.d, z2.d, z0.d
>>>       ret
>>> 
>>> which generated a gas warning.
>>> 
>>> Bootstrapped and tested on aarch64-none-linux-gnu.
>>> 
>>> Signed-off-by: Kyrylo Tkachov <ktkac...@nvidia.com>
>>> 
>>> gcc/
>>> 
>>> PR target/120999
>>> * config/aarch64/aarch64-sve2.md (*aarch64_sve2_nor<mode>):
>>> Remove movprfx alternative.
>>> 
>>> gcc/testsuite/
>>> 
>>> PR target/120999
>>> * gcc.target/aarch64/sve2/pr120999.c: New test.
>>> ---
>>> gcc/config/aarch64/aarch64-sve2.md               | 11 ++++-------
>>> gcc/testsuite/gcc.target/aarch64/sve2/pr120999.c | 15 +++++++++++++++
>>> 2 files changed, 19 insertions(+), 7 deletions(-)
>>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/pr120999.c
>>> 
>>> diff --git a/gcc/config/aarch64/aarch64-sve2.md 
>>> b/gcc/config/aarch64/aarch64-sve2.md
>>> index 15714712d3b..504ba6fc39b 100644
>>> --- a/gcc/config/aarch64/aarch64-sve2.md
>>> +++ b/gcc/config/aarch64/aarch64-sve2.md
>>> @@ -1616,20 +1616,17 @@
>>> 
>>> ;; Use NBSL for vector NOR.
>>> (define_insn_and_rewrite "*aarch64_sve2_nor<mode>"
>>> -  [(set (match_operand:SVE_FULL_I 0 "register_operand")
>>> +  [(set (match_operand:SVE_FULL_I 0 "register_operand" "=w")
>>> (unspec:SVE_FULL_I
>>>  [(match_operand 3)
>>>   (and:SVE_FULL_I
>>>     (not:SVE_FULL_I
>>> -        (match_operand:SVE_FULL_I 1 "register_operand"))
>>> +        (match_operand:SVE_FULL_I 1 "register_operand" "%0"))
>>>     (not:SVE_FULL_I
>>> -        (match_operand:SVE_FULL_I 2 "register_operand")))]
>>> +        (match_operand:SVE_FULL_I 2 "register_operand" "w")))]
>>>  UNSPEC_PRED_X))]
>>>  "TARGET_SVE2"
>>> -  {@ [ cons: =0 , %1 , 2 ; attrs: movprfx ]
>>> -     [ w        , 0  , w ; *              ] nbsl\t%0.d, %0.d, %2.d, %0.d
>>> -     [ ?&w      , w  , w ; yes            ] movprfx\t%0, %1\;nbsl\t%0.d, 
>>> %0.d, %2.d, %0.d
>>> -  }
>>> +  "nbsl\t%0.d, %0.d, %2.d, %0.d"
>>>  "&& !CONSTANT_P (operands[3])"
>>>  {
>>>    operands[3] = CONSTM1_RTX (<VPRED>mode);
>>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pr120999.c 
>>> b/gcc/testsuite/gcc.target/aarch64/sve2/pr120999.c
>>> new file mode 100644
>>> index 00000000000..1cdfa4107ae
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/aarch64/sve2/pr120999.c
>>> @@ -0,0 +1,15 @@
>>> +/* PR target/120999.  */
>>> +/* { dg-do assemble } */
>>> +/* { dg-options "-O2 --save-temps" } */
>>> +
>>> +/* We shouldn't be generating a MOVPRFX form of NBSL here or we'll get
>>> +   an assembler warning.  */
>>> +
>>> +#include <arm_sve.h>
>>> +
>>> +#define NOR(x, y)   (~((x) | (y)))
>>> +
>>> +svuint64_t nor_z(svuint64_t c, svuint64_t a, svuint64_t b) { return NOR(a, 
>>> b); }
>>> +
>>> +/* { dg-final { scan-assembler-not {\tmovprfx} } } */
>>> +
> 
> 
> <0001-aarch64-PR-target-120999-Adjust-operands-for-movprfx.patch>

Reply via email to