https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96918
--- Comment #2 from Gabriel Ravier <gabravier at gmail dot com> ---
Yes, you can reproduce this with _mm_shuffle_epi8, _mm_slli_epi16 and _mm_srli_epi16. I assumed GCC developers are more familiar with the internal intrinsics than with the Intel-provided intrinsics, so I didn't bother converting it to Intel intrinsics. Was that wrong?