https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63330

            Bug ID: 63330
           Summary: vector shuffle resembling vector shift not expanded
                    optimally
           Product: gcc
           Version: 4.9.1
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
            Target: x86_64-*-*, i?86-*-*

typedef int v4si __attribute__((vector_size(16)));
v4si foo (v4si x)
{
  return __builtin_shuffle (x, (v4si){ 0, 0, 0, 0 },
                             (v4si){4, 3, 2, 1 });
}

This and similar shuffles that "shift" a vector left or right by
whole-element amounts and insert zeros are not expanded optimally,
even though the target has at least a vec_shr optab, which suggests
something suitable would be available.

With -mavx2 I get for the above

        vpxor   %xmm1, %xmm1, %xmm1
        vpalignr        $4, %xmm0, %xmm1, %xmm0
        vpshufd $27, %xmm0, %xmm0

while I expected something like

        psrldq  $4, %xmm0
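
For reference, here is a minimal scalar sketch of the lane movement
involved. It is a model only, not GCC code: shuffle2, shift_right_lanes
and LANES are hypothetical names. It assumes the documented two-operand
__builtin_shuffle semantics, where indices 0..3 select from the first
vector and 4..7 from the second. The mask { 1, 2, 3, 4 } is used as the
assumed form of a right shift by one element; the mask in the testcase
above, { 4, 3, 2, 1 }, yields the same elements in reversed order, which
matches the trailing vpshufd $27 in the generated code.

```c
#include <assert.h>
#include <stddef.h>

#define LANES 4  /* number of int lanes in a v4si */

/* Scalar model of two-operand __builtin_shuffle (a, b, mask):
   an index in 0..LANES-1 selects from a, an index in
   LANES..2*LANES-1 selects from b; indices wrap modulo 2*LANES.  */
void shuffle2(const int a[LANES], const int b[LANES],
              const int mask[LANES], int out[LANES])
{
    for (size_t i = 0; i < LANES; i++) {
        unsigned idx = (unsigned)mask[i] % (2u * LANES);
        out[i] = idx < LANES ? a[idx] : b[idx - LANES];
    }
}

/* Lane-level effect of a whole-element right shift with zero fill,
   i.e. what psrldq $(4 * count) does to four 32-bit lanes.  */
void shift_right_lanes(const int a[LANES], unsigned count, int out[LANES])
{
    assert(count <= LANES);  /* limit of this model, not of the instruction */
    for (size_t i = 0; i < LANES; i++)
        out[i] = (i + count < LANES) ? a[i + count] : 0;
}
```

With x = { 10, 20, 30, 40 } and a zero second operand, the mask
{ 1, 2, 3, 4 } gives { 20, 30, 40, 0 }, the same as
shift_right_lanes (x, 1), while { 4, 3, 2, 1 } gives { 0, 40, 30, 20 }.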

__builtin_shuffle (x, (v4si) { -1, -1, -1, -1 }, ... )

Arbitrary constants "shifted in" could be handled the same way by
IORing the shifted-in value into the appropriate elements after the
psrldq, at the cost of an extra vector constant.
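
A minimal scalar sketch of that idea, under the assumption that the
shift is a right shift by count whole elements: shift with zero fill
first, then OR in a constant vector that is zero in the lanes that came
from x and holds the shifted-in values elsewhere. shift_in_constant and
LANES are hypothetical names for illustration; this models the lanes
only and is not how GCC expands anything.

```c
#include <assert.h>
#include <stddef.h>

#define LANES 4  /* number of int lanes in a v4si */

/* Model of  __builtin_shuffle (x, c, { count, count+1, ... })
   computed as  (x shifted right by count lanes, zero fill) | fill,
   where fill is a vector constant derived from c.  */
void shift_in_constant(const int x[LANES], const int c[LANES],
                       unsigned count, int out[LANES])
{
    assert(count <= LANES);
    for (size_t i = 0; i < LANES; i++) {
        int from_x = i + count < LANES;
        /* The psrldq step: lanes past the end become zero.  */
        int shifted = from_x ? x[i + count] : 0;
        /* The extra vector constant: zero where x survives,
           the leading elements of c in the vacated lanes.  */
        int fill = from_x ? 0 : c[i + count - LANES];
        out[i] = shifted | fill;
    }
}
```

For c = { -1, -1, -1, -1 } and count = 1 this yields
{ x[1], x[2], x[3], -1 }.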
