https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63330
            Bug ID: 63330
           Summary: vector shuffle resembling vector shift not expanded
                    optimally
           Product: gcc
           Version: 4.9.1
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
            Target: x86_64-*-*, i?86-*-*

typedef int v4si __attribute__((vector_size(16)));

v4si foo (v4si x)
{
  return __builtin_shuffle (x, (v4si){ 0, 0, 0, 0 }, (v4si){4, 3, 2, 1 });
}

and similar shuffles "shifting" a vector by whole-element amounts left/right
and inserting zeros are not expanded optimally (while the target has at least
a vec_shr optab, which suggests something would be available).

With -mavx2 I get for the above

        vpxor   %xmm1, %xmm1, %xmm1
        vpalignr        $4, %xmm0, %xmm1, %xmm0
        vpshufd $27, %xmm0, %xmm0

while I expected something like

        psrldq  $4, %xmm0

  __builtin_shuffle (x, (v4si) { -1, -1, -1, -1 }, ... )

Arbitrary constants "shifted in" could be handled the same way by IORing the
shifted-in value into the appropriate elements after the psrldq, for the cost
of an extra vector constant.
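A minimal sketch of the two equivalences the report relies on, using GCC's vector extensions. The function names are hypothetical and chosen only for illustration. Note that the sketch uses the mask { 1, 2, 3, 4 }, which is the element order a lone psrldq $4 would produce; the mask { 4, 3, 2, 1 } in the test case above additionally reverses the elements, which corresponds to the vpshufd $27 in the -mavx2 output. The sketch only checks that the results agree; it makes no claim about which instructions a given GCC version emits.

```c
#include <assert.h>
#include <string.h>

typedef int v4si __attribute__((vector_size(16)));

/* Shuffle form of a whole-element right shift with zero fill.
   Mask indices 0..3 select from x, 4..7 from the zero vector, so the
   result is { x[1], x[2], x[3], 0 }. */
v4si shift_right_by_shuffle(v4si x)
{
    return __builtin_shuffle(x, (v4si){0, 0, 0, 0}, (v4si){1, 2, 3, 4});
}

/* Reference: move elements 1..3 down by one slot and leave the top
   element zero. On x86 this is the effect of psrldq $4. */
v4si shift_right_reference(v4si x)
{
    v4si r = {0, 0, 0, 0};
    memcpy(&r, (const char *)&x + sizeof(int), 3 * sizeof(int));
    return r;
}

/* Shifting in an arbitrary constant c: do the zero-filling shift, then
   IOR c into the vacated element using one extra vector constant. */
v4si shift_in_constant(v4si x, int c)
{
    v4si fill = {0, 0, 0, c};
    return shift_right_by_shuffle(x) | fill;
}

/* Returns 1 when the two vectors are element-wise equal. */
int v4si_equal(v4si a, v4si b)
{
    return memcmp(&a, &b, sizeof a) == 0;
}
```

For x = { 10, 20, 30, 40 }, shift_right_by_shuffle (x) and shift_right_reference (x) both give { 20, 30, 40, 0 }, and shift_in_constant (x, -1) gives the same vector as shuffling x against an all-ones vector with the mask { 1, 2, 3, 4 }.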