https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91940
--- Comment #7 from Marc Glisse <glisse at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #1)
> The loop with the rotate is vectorized, while the one with __builtin_bswap16
> is not.

It is a bit surprising that we do not canonicalize one to the other somewhere
in the middle-end.
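
For readers without the original test case at hand, here is a minimal sketch of the two forms being compared. It is not the code attached to the bug; the function names (rotl16_by8, swap_with_rotate, swap_with_builtin) and the loop shape are hypothetical, chosen only to illustrate that an 8-bit rotate of a 16-bit value and __builtin_bswap16 compute the same result, which is why one would expect them to be canonicalized to a single form.

```c
#include <stdint.h>

/* Rotate a 16-bit value left by 8 bits. At this width the rotate
   exchanges the high and low bytes, i.e. it is a byte swap. */
static inline uint16_t rotl16_by8(uint16_t x)
{
    /* x is promoted to int before shifting; the cast truncates
       the result back to 16 bits. */
    return (uint16_t)((x << 8) | (x >> 8));
}

/* Hypothetical loop using the rotate idiom. */
void swap_with_rotate(uint16_t *restrict dst, const uint16_t *restrict src,
                      int n)
{
    for (int i = 0; i < n; i++)
        dst[i] = rotl16_by8(src[i]);
}

/* Hypothetical loop using the GCC builtin; same result per element. */
void swap_with_builtin(uint16_t *restrict dst, const uint16_t *restrict src,
                       int n)
{
    for (int i = 0; i < n; i++)
        dst[i] = __builtin_bswap16(src[i]);
}
```

Compiling both functions with optimization and comparing the generated code (for example with -O3 and -fopt-info-vec) is one way to check whether a given GCC version treats the two loops alike.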