Tamar Christina writes:
> -Original Message-
> From: Kyrylo Tkachov
> Sent: Monday, July 7, 2025 10:38 AM
> To: GCC Patches
> Cc: Richard Sandiford ; Richard Earnshaw
> ; Alex Coplan ; Andrew
> Pinski
> Subject: [PATCH] aarch64: Improve popcountti2 with SVE
Hi all,
The TImode popcount sequence can be slightly improved with SVE.
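As background, here is a minimal C sketch of the identity that an SVE-style sequence can exploit: the popcount of a 128-bit value equals the sum of the popcounts of its two 64-bit halves, so a per-doubleword count followed by one pairwise add is enough. The function names are hypothetical and not taken from the patch; `unsigned __int128` and `__builtin_popcountll` are GCC/Clang extensions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not from the patch: popcount of a 128-bit value
   computed as the sum of the popcounts of its two 64-bit halves.  */
int
popcount128_by_halves (unsigned __int128 x)
{
  uint64_t lo = (uint64_t) x;          /* low doubleword */
  uint64_t hi = (uint64_t) (x >> 64);  /* high doubleword */
  return __builtin_popcountll (lo) + __builtin_popcountll (hi);
}

/* Reference implementation: clear the lowest set bit until none remain.  */
int
popcount128_reference (unsigned __int128 x)
{
  int n = 0;
  while (x)
    {
      x &= x - 1;
      n++;
    }
  return n;
}

/* Assert that both implementations agree on X.  */
void
popcount128_check (unsigned __int128 x)
{
  assert (popcount128_by_halves (x) == popcount128_reference (x));
}
```

Calling `popcount128_check` on a few values with bits set in both halves is a quick way to convince yourself that the split is valid.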
If we generate:
        ldr     q31, [x0]
        ptrue   p7.b, vl16
        cnt     z31.d, p7/m, z31.d
        addp    d31, v31.2d
        fmov    x0, d31
        ret

instead of:
h128:
        ldr     q31, [x0]
        cnt     v31.16b, v31.16b
        addv    b31, v31.16b
        fmov    w0, s31
        ret

we use the ADDP instruction for