https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102652
ktkachov at gcc dot gnu.org changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |ktkachov at gcc dot gnu.org
Status|UNCONFIRMED |NEW
Ever confirmed|0 |1
Last reconfirmed| |2021-10-08
--- Comment #1 from ktkachov at gcc dot gnu.org ---
Confirmed on the GCC 11 release. There is an active effort to improve the code
generation for these intrinsics and current trunk produces:
bug:
ldr q5, [x1]
sshr v4.16b, v5.16b, 7
mov v0.16b, v5.16b
mov v1.16b, v4.16b
mov v2.16b, v4.16b
mov v3.16b, v4.16b
st4 {v0.16b - v3.16b}, [x0], 64
ldr q4, [x1, 16]
mov v0.16b, v4.16b
sshr v4.16b, v4.16b, 7
mov v1.16b, v4.16b
mov v2.16b, v4.16b
mov v3.16b, v4.16b
st4 {v0.16b - v3.16b}, [x0]
ret
Not optimal yet, but moving in the right direction