https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109498
ktkachov at gcc dot gnu.org changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to work|14.0                        |
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
                 CC|                            |ktkachov at gcc dot gnu.org
   Last reconfirmed|                            |2024-07-31
--- Comment #2 from ktkachov at gcc dot gnu.org ---
This does get vectorised with SVE in GCC 14 but not optimally. It doesn't use
the recommended RBIT + CLZ but instead gives:
ctz:
        cmp     w2, 0
        ble     .L1
        mov     x3, 0
        whilelo p7.s, wzr, w2
        ptrue   p6.b, all
.L3:
        ld1w    z31.s, p7/z, [x1, x3, lsl 2]
        movprfx z30, z31
        neg     z30.s, p6/m, z31.s
        and     z30.d, z30.d, z31.d
        clz     z30.s, p6/m, z30.s
        subr    z30.s, z30.s, #31
        st1w    z30.s, p7, [x0, x3, lsl 2]
        incw    x3
        whilelo p7.s, w3, w2
        b.any   .L3
.L1:
        ret
with -Ofast -march=armv9-a --param aarch64-autovec-preference=2.
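
For reference, the per-element arithmetic in the loop above (NEG, AND, CLZ, SUBR #31) and the RBIT + CLZ alternative can be sketched in scalar C as below. This is only an illustration, not the testcase from this PR: the function names ctz_neg_and_clz, ctz_rbit_clz and rbit32 are made up here, rbit32 is a portable stand-in for the RBIT instruction, and both variants assume a nonzero input because __builtin_clz (0) is undefined.

```c
#include <stdint.h>

/* Hypothetical helper: portable 32-bit bit reversal, standing in for RBIT. */
static uint32_t rbit32(uint32_t x)
{
    x = ((x >> 1) & 0x55555555u) | ((x & 0x55555555u) << 1);  /* swap adjacent bits */
    x = ((x >> 2) & 0x33333333u) | ((x & 0x33333333u) << 2);  /* swap bit pairs */
    x = ((x >> 4) & 0x0F0F0F0Fu) | ((x & 0x0F0F0F0Fu) << 4);  /* swap nibbles */
    x = ((x >> 8) & 0x00FF00FFu) | ((x & 0x00FF00FFu) << 8);  /* swap bytes */
    return (x >> 16) | (x << 16);                             /* swap halves */
}

/* Mirrors the sequence in the generated loop:
   NEG + AND isolate the lowest set bit, CLZ counts the zeros above it,
   SUBR #31 turns that into the bit index.  Requires x != 0.  */
static int ctz_neg_and_clz(uint32_t x)
{
    uint32_t lowest = x & (0u - x);
    return 31 - __builtin_clz(lowest);
}

/* The RBIT + CLZ form: reversing the bits turns trailing zeros
   into leading zeros.  Requires x != 0.  */
static int ctz_rbit_clz(uint32_t x)
{
    return __builtin_clz(rbit32(x));
}
```

Both forms give the same result for nonzero inputs; the RBIT + CLZ form needs two operations per element where the sequence above needs four plus the MOVPRFX.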