https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88868
            Bug ID: 88868
           Summary: [SSE] pshufb can be omitted for a specific pattern
           Product: gcc
           Version: 8.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: wojciech_mula at poczta dot onet.pl
  Target Milestone: ---

The SSSE3 instruction PSHUFB (and the AVX2 counterpart VPSHUFB) acts as a
no-op when its control argument is the byte sequence 0..15. Such an
invocation does not alter the shuffled register, so PSHUFB can be safely
omitted.

BTW, clang does this optimization, but ICC doesn't.

---pshufb.c---
#include <immintrin.h>

__m128i shuffle(__m128i x) {
    const __m128i noop = _mm_setr_epi8(0, 1, 2, 3, 4, 5, 6, 7,
                                       8, 9, 10, 11, 12, 13, 14, 15);
    return _mm_shuffle_epi8(x, noop);
}
---eof---

$ gcc --version
gcc (Debian 8.2.0-13) 8.2.0

$ gcc -O3 -march=skylake -S pshufb.c
$ cat pshufb.s
shuffle:
        vpshufb .LC0(%rip), %xmm0, %xmm0
        ret
.LC0:
        .byte   0
        .byte   1
        .byte   2
        .byte   3
        .byte   4
        .byte   5
        .byte   6
        .byte   7
        .byte   8
        .byte   9
        .byte   10
        .byte   11
        .byte   12
        .byte   13
        .byte   14
        .byte   15

Expected output:

shuffle:
        ret
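
To illustrate why the control vector 0..15 makes the shuffle a no-op, here is a minimal scalar sketch of the 128-bit PSHUFB semantics: each destination byte is zeroed if bit 7 of its control byte is set, and otherwise copied from the source byte selected by the low four bits of the control byte. The names pshufb_model and is_identity_control are hypothetical helpers written for this illustration only; they are not GCC internals and do not describe how clang detects the pattern.

```c
#include <stdint.h>

/* Scalar model of 128-bit PSHUFB (hypothetical helper, for illustration).
 * dst must not alias src or ctrl. */
static void pshufb_model(uint8_t dst[16], const uint8_t src[16],
                         const uint8_t ctrl[16])
{
    for (int i = 0; i < 16; i++) {
        if (ctrl[i] & 0x80)
            dst[i] = 0;                   /* bit 7 set: zero the byte   */
        else
            dst[i] = src[ctrl[i] & 0x0F]; /* low 4 bits pick the source */
    }
}

/* Returns 1 when ctrl is exactly 0, 1, ..., 15, the pattern from this
 * report, for which pshufb_model copies src to dst unchanged.
 * Hypothetical helper; an assumption about what a compiler would test. */
static int is_identity_control(const uint8_t ctrl[16])
{
    for (int i = 0; i < 16; i++) {
        if (ctrl[i] != (uint8_t)i)
            return 0;
    }
    return 1;
}
```

With ctrl[i] == i for every i, the first branch is never taken and dst[i] = src[i], so the instruction leaves the register as it was. A compiler that sees a constant control vector passing a check like is_identity_control could, under that assumption, replace the shuffle with its first operand.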