Kyrylo Tkachov <ktkac...@nvidia.com> writes:
>> On 10 Jul 2025, at 11:12, Kyrylo Tkachov <ktkac...@nvidia.com> wrote:
>> 
>> 
>> 
>>> On 10 Jul 2025, at 10:40, Richard Sandiford <richard.sandif...@arm.com> wrote:
>>> 
>>> Kyrylo Tkachov <ktkac...@nvidia.com> writes:
>>>> Hi all,
>>>> 
>>>> While the SVE2 NBSL instruction accepts MOVPRFX to add more flexibility
>>>> due to its tied operands, the destination of the movprfx cannot also be
>>>> a source operand. But the offending pattern in aarch64-sve2.md tries
>>>> to do exactly that for the "=?&w,w,w" alternative and gas warns for the
>>>> attached testcase.
>>>> 
>>>> This patch just removes that alternative, causing RA to emit a normal
>>>> extra move.
>>>> So for the testcase in the patch we now generate:
>>>> nor_z:
>>>> nbsl z1.d, z1.d, z2.d, z1.d
>>>> mov z0.d, z1.d
>>>> ret
>>>> 
>>>> instead of the previous:
>>>> nor_z:
>>>> movprfx z0, z1
>>>> nbsl z0.d, z0.d, z2.d, z0.d
>>>> ret
>>>> 
>>>> which generated a gas warning.
>>> 
>>> Shouldn't we instead change it to:
>>> 
>>>    [ ?&w      , w  , w ; yes            ] movprfx\t%0, %1\;nbsl\t%0.d, %0.d, %2.d, %1.d
>>> 
>>> ?  The "&" ensures that %1 is still valid in the NBSL.
>>> 
>>> (That's OK if it works.)
>> 
>> Yes, that seems to work, thanks.
>> I’ll push this version after some more testing.
>> 
>
> Shall I backport this for GCC 15.2 as well?
> The test case uses C operators which were enabled in GCC 15, though I suppose 
> one could construct a pure ACLE intrinsics testcase too.

Sounds good to me.  It's fixing wrong code, even if the gas warning
makes it somewhat noisy wrong code.

Thanks,
Richard
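
For readers following the thread, here is a minimal scalar sketch in C of
why NBSL with the same register as tied operand and selector yields NOR.
It assumes the per-bit semantics result = ~((Zdn & Zk) | (Zm & ~Zk)) for
"nbsl Zdn, Zdn, Zm, Zk"; the names nbsl_model, nor_via_nbsl and
check_nor_identity are hypothetical and appear in neither the patch nor
GCC. It does not model MOVPRFX at all. It only illustrates that the
selector must hold the original first input, which is what using %1.d as
the last operand of the suggested alternative arranges.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed per-bit model of "nbsl Zdn, Zdn, Zm, Zk" on one 64-bit lane:
   take bits of Zdn where Zk is 1 and bits of Zm where Zk is 0,
   then invert the result.  */
static uint64_t
nbsl_model (uint64_t zdn, uint64_t zm, uint64_t zk)
{
  return ~((zdn & zk) | (zm & ~zk));
}

/* NOR expressed through NBSL: the first input is passed both as the
   tied operand and as the selector, as in the sequences quoted above.  */
static uint64_t
nor_via_nbsl (uint64_t a, uint64_t b)
{
  return nbsl_model (a, b, a);
}

/* Hypothetical self-check: compare against a plain ~(a | b) on a few
   sample bit patterns.  */
static void
check_nor_identity (void)
{
  const uint64_t samples[] = {
    0, 1, 0xF0F0F0F0F0F0F0F0ULL, 0x0123456789ABCDEFULL, UINT64_MAX
  };
  const int n = sizeof samples / sizeof samples[0];

  for (int i = 0; i < n; i++)
    for (int j = 0; j < n; j++)
      assert (nor_via_nbsl (samples[i], samples[j])
              == (uint64_t) ~(samples[i] | samples[j]));
}
```

Algebraically, ~((a & a) | (b & ~a)) simplifies to ~(a | b), so the
identity holds only while the selector still contains the original a.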
