Kyrylo Tkachov <ktkac...@nvidia.com> writes:
>> On 10 Jul 2025, at 11:12, Kyrylo Tkachov <ktkac...@nvidia.com> wrote:
>>
>>> On 10 Jul 2025, at 10:40, Richard Sandiford <richard.sandif...@arm.com> wrote:
>>>
>>> Kyrylo Tkachov <ktkac...@nvidia.com> writes:
>>>> Hi all,
>>>>
>>>> While the SVE2 NBSL instruction accepts MOVPRFX to add more flexibility
>>>> due to its tied operands, the destination of the movprfx cannot be also
>>>> a source operand. But the offending pattern in aarch64-sve2.md tries
>>>> to do exactly that for the "=?&w,w,w" alternative and gas warns for the
>>>> attached testcase.
>>>>
>>>> This patch just removes that alternative causing RA to emit a normal extra
>>>> move.
>>>> So for the testcase in the patch we now generate:
>>>> nor_z:
>>>>   nbsl z1.d, z1.d, z2.d, z1.d
>>>>   mov z0.d, z1.d
>>>>   ret
>>>>
>>>> instead of the previous:
>>>> nor_z:
>>>>   movprfx z0, z1
>>>>   nbsl z0.d, z0.d, z2.d, z0.d
>>>>   ret
>>>>
>>>> which generated a gas warning.
>>>
>>> Shouldn't we instead change it to:
>>>
>>>   [ ?&w , w , w ; yes ] movprfx\t%0, %1\;nbsl\t%0.d, %0.d, %2.d, %1.d
>>>
>>> ? The "&" ensures that %1 is still valid in the NBSL.
>>>
>>> (That's OK if it works.)
>>
>> Yes, that seems to work, thanks.
>> I’ll push this version after some more testing.
>>
>
> Shall I backport this for GCC 15.2 as well?
> The test case uses C operators which were enabled in GCC 15, though I suppose
> one could construct a pure ACLE intrinsics testcase too.

Sounds good to me. It's fixing wrong code, even if the gas warning
makes it somewhat noisy wrong code.

Thanks,
Richard