https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96918
--- Comment #5 from Marc Glisse <glisse at gcc dot gnu.org> ---
typedef unsigned short v8i16 __attribute__((vector_size(16)));
v8i16 bswap_epi16(v8i16 x)
{
  return (x << 8) | (x >> 8);
}
We already recognize a rotate in GENERIC:
return x r<< 8;
But this is expanded to:
movdqa %xmm0, %xmm1
psrlw $8, %xmm0
psllw $8, %xmm1
por %xmm1, %xmm0
Probably the target could advertise a rotate insn for that mode, restricted to
a rotate count of 8?
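For reference, a rotate by 8 on 16-bit lanes is the same thing as swapping the
two bytes of each lane, which is why the restriction to a count of 8 is
attractive. A minimal sketch to check that equivalence, assuming GCC-style
vector extensions (bswap_epi16_ref and same_lanes are hypothetical helpers,
not anything in GCC):

```c
#include <stdint.h>
#include <string.h>

typedef unsigned short v8i16 __attribute__((vector_size(16)));

/* The function from the report: a per-lane rotate by 8. */
static inline v8i16 bswap_epi16(v8i16 x)
{
  return (x << 8) | (x >> 8);
}

/* Hypothetical reference: swap the two bytes of each 16-bit lane,
   one lane at a time, using scalar arithmetic. */
static inline v8i16 bswap_epi16_ref(v8i16 x)
{
  v8i16 r = {0};
  for (int i = 0; i < 8; i++) {
    uint16_t lane = x[i];
    r[i] = (uint16_t)((lane << 8) | (lane >> 8));
  }
  return r;
}

/* Hypothetical helper: compare two vectors lane for lane. */
static inline int same_lanes(v8i16 a, v8i16 b)
{
  return memcmp(&a, &b, sizeof a) == 0;
}
```

Comparing bswap_epi16(x) against bswap_epi16_ref(x) for a few inputs should
show the two agree on every lane.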
IIRC, I didn't use vector extensions for the corresponding shift intrinsics
because for large shift amounts they set the result to 0. But for a constant
scalar, we could lower the builtin to a shift (or fold to 0).
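
A rough model of that lowering, as I understand the semantics: the shift
intrinsics give 0 once the count exceeds 15, while a vector-extension shift by
16 or more is undefined, so the two only coincide for small counts. The helper
name sll_epi16_model is hypothetical and this is only a sketch of the idea,
not how the builtin is actually folded:

```c
typedef unsigned short v8i16 __attribute__((vector_size(16)));

/* Hypothetical model of a psllw-style shift written with vector
   extensions: counts above 15 yield all-zero lanes, anything else
   is an ordinary per-lane shift. */
static inline v8i16 sll_epi16_model(v8i16 x, unsigned count)
{
  if (count > 15)
    return (v8i16){0};              /* large count: fold to 0 */

  unsigned short c = (unsigned short)count;
  v8i16 counts = {c, c, c, c, c, c, c, c};
  return x << counts;               /* small count: plain vector shift */
}
```

With a constant count the branch disappears at compile time, leaving either
the plain shift or the zero vector.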