https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88868

            Bug ID: 88868
           Summary: [SSE] pshufb can be omitted for a specific pattern
           Product: gcc
           Version: 8.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: wojciech_mula at poczta dot onet.pl
  Target Milestone: ---

The SSSE3 instruction PSHUFB (and its AVX/AVX2 counterpart VPSHUFB) acts as a
no-op when its shuffle-control argument is the byte sequence 0..15. Such an
invocation does not alter the shuffled register, so the PSHUFB can be safely
omitted.

BTW, clang does this optimization, but ICC doesn't.
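
To make the no-op claim concrete, here is a minimal scalar sketch of the
128-bit PSHUFB semantics. The names pshufb_model, mask_is_identity and
check_noop are hypothetical helpers written for this illustration; they are
not part of GCC or of any intrinsics header. The model assumes the documented
behaviour: a result byte is zero when bit 7 of its control byte is set,
otherwise it is the source byte selected by the low four bits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of 128-bit PSHUFB (hypothetical helper, for illustration).
   dst[i] is 0 when bit 7 of mask[i] is set, else src[mask[i] & 0x0F]. */
static void pshufb_model(uint8_t dst[16], const uint8_t src[16],
                         const uint8_t mask[16])
{
    uint8_t tmp[16];                 /* temporary so dst may alias src */
    for (int i = 0; i < 16; i++)
        tmp[i] = (mask[i] & 0x80) ? 0 : src[mask[i] & 0x0F];
    memcpy(dst, tmp, 16);
}

/* Returns 1 when the shuffle leaves every possible input unchanged.
   Bits 4..6 of a control byte are ignored by the instruction, so only
   bit 7 and the low nibble are compared against the byte position. */
static int mask_is_identity(const uint8_t mask[16])
{
    for (int i = 0; i < 16; i++)
        if ((mask[i] & 0x8F) != i)
            return 0;
    return 1;
}

/* Self-check: the mask 0..15 from the report is an identity shuffle. */
static void check_noop(void)
{
    uint8_t mask[16], src[16], dst[16];
    for (int i = 0; i < 16; i++) {
        mask[i] = (uint8_t)i;            /* 0, 1, ..., 15 */
        src[i]  = (uint8_t)(0xA0 + i);   /* arbitrary test pattern */
    }
    pshufb_model(dst, src, mask);
    assert(mask_is_identity(mask));
    assert(memcmp(dst, src, 16) == 0);
}
```

A predicate along the lines of mask_is_identity, applied to a constant
control vector, is one way a compiler could recognise this pattern; whether
GCC would implement the fold this way is an assumption.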

---pshufb.c---
#include <immintrin.h>

__m128i shuffle(__m128i x) {
    const __m128i noop = _mm_setr_epi8(0, 1, 2, 3, 4, 5, 6, 7,
                                       8, 9, 10, 11, 12, 13, 14, 15);
    return _mm_shuffle_epi8(x, noop);
}
---eof---

$ gcc --version
gcc (Debian 8.2.0-13) 8.2.0

$ gcc -O3 -march=skylake -S pshufb.c 
$ cat pshufb.s
shuffle:
        vpshufb .LC0(%rip), %xmm0, %xmm0
        ret
.LC0:
        .byte   0
        .byte   1
        .byte   2
        .byte   3
        .byte   4
        .byte   5
        .byte   6
        .byte   7
        .byte   8
        .byte   9
        .byte   10
        .byte   11
        .byte   12
        .byte   13
        .byte   14
        .byte   15

Expected output:

shuffle:
    ret
