https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116675

            Bug ID: 116675
           Summary: No blend constant permute for V8HImode with just SSE2
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

ix86_expand_vec_perm_const says a blend of two V8HImode vectors isn't
supported with just SSE2.  The vec_perm_indices is
{ 0, 9, 2, 11, 4, 13, 6, 15 } and a fallback should be two bitwise ANDs
with a constant mask followed by a bitwise IOR.
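
For illustration, the AND/AND/IOR fallback can be sketched with GNU vector
extensions.  This is only a source-level sketch of the intended lowering,
not the expander code; the helper name blend_even_a_odd_b is hypothetical.

```c
typedef unsigned short v8hi __attribute__((vector_size(16)));

/* Hypothetical helper (not part of GCC): select even lanes from a and
   odd lanes from b, i.e. the permutation { 0, 9, 2, 11, 4, 13, 6, 15 }.  */
static v8hi
blend_even_a_odd_b (v8hi a, v8hi b)
{
  /* All-ones in the lanes taken from a, zero in the lanes taken from b.  */
  const v8hi mask = { 0xffff, 0, 0xffff, 0, 0xffff, 0, 0xffff, 0 };

  /* Two ANDs (one with the inverted mask) followed by an IOR.  */
  return (a & mask) | (b & ~mask);
}
```

The second AND uses the complement of the same constant, so a single mask
load is enough if the target has an and-not instruction.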

With SSSE3 we get

        pshufb  .LC0(%rip), %xmm0
        pshufb  .LC1(%rip), %xmm1
        por     %xmm1, %xmm0

clang can do

        movaps  .LCPI0_0(%rip), %xmm2   # xmm2 = [65535,0,65535,0,65535,0,65535,0]
        andps   %xmm2, %xmm0
        andnps  %xmm1, %xmm2
        orps    %xmm2, %xmm0

with just SSE2.

typedef unsigned short v8hi __attribute__((vector_size(16)));

v8hi foo (v8hi a, v8hi b)
{
  return __builtin_shufflevector (a, b, 0, 9, 2, 11, 4, 13, 6, 15);
}
