https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79938

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
   Last reconfirmed|                            |2021-08-02
             Status|UNCONFIRMED                 |NEW
          Component|target                      |tree-optimization
     Ever confirmed|0                           |1

--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
I think we could do better than what we are producing on the trunk:
  _1 = BIT_FIELD_REF <a_49(D), 8, 0>;
  _3 = BIT_FIELD_REF <a_49(D), 8, 8>;
  _6 = BIT_FIELD_REF <a_49(D), 8, 16>;
  _8 = BIT_FIELD_REF <a_49(D), 8, 24>;
  _13 = BIT_FIELD_REF <a_49(D), 8, 32>;
  _15 = BIT_FIELD_REF <a_49(D), 8, 40>;
  _18 = BIT_FIELD_REF <a_49(D), 8, 48>;
  _20 = BIT_FIELD_REF <a_49(D), 8, 56>;
  _25 = BIT_FIELD_REF <a_49(D), 8, 64>;
  _27 = BIT_FIELD_REF <a_49(D), 8, 72>;
  _30 = BIT_FIELD_REF <a_49(D), 8, 80>;
  _32 = BIT_FIELD_REF <a_49(D), 8, 88>;
  _37 = BIT_FIELD_REF <a_49(D), 8, 96>;
  _63 = {_1, _13, _25, _37};
  vect__2.10_22 = (vector(4) int) _63;
  _39 = BIT_FIELD_REF <a_49(D), 8, 104>;
  _29 = {_3, _15, _27, _39};
  vect__4.11_60 = (vector(4) int) _29;
  _69 = vect__2.10_22 + vect__4.11_60;
  _42 = BIT_FIELD_REF <a_49(D), 8, 112>;
  _10 = {_6, _18, _30, _42};
  vect__7.9_17 = (vector(4) int) _10;
  _44 = BIT_FIELD_REF <a_49(D), 8, 120>;
  _5 = {_8, _20, _32, _44};
  vect__9.8_66 = (vector(4) int) _5;
  _70 = vect__7.9_17 + vect__9.8_66;
  vect__11.14_57 = _69 + _70;
  _55 = VIEW_CONVERT_EXPR<__m128i>(vect__11.14_57);

------ CUT ----
We could produce a shuffle from a_49(D) and then do extracts to get _63, _29,
_10, and _5.
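
To make the suggestion concrete, here is a minimal scalar sketch in C of the two strategies. It is not GCC code. It only models the data movement, assuming the 8-bit elements are unsigned (the dump does not show the element type). The names sum_by_extracts, sum_by_shuffle, permute_bytes, results_match and deinterleave_idx are hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reference: mirrors the dump, one byte extract per lane.
   Lane i of _63, _29, _10 and _5 is byte 4*i+0, 4*i+1, 4*i+2 and 4*i+3.
   Assumption: the bytes are unsigned. */
void sum_by_extracts(const uint8_t a[16], int32_t out[4])
{
    for (int i = 0; i < 4; i++) {
        out[i] = (int32_t)a[4 * i + 0] + a[4 * i + 1]
               + a[4 * i + 2] + a[4 * i + 3];
    }
}

/* Permutation that groups the bytes the way the four constructors
   need them: bytes 0..3 of the result are _63, 4..7 are _29,
   8..11 are _10, 12..15 are _5. */
static const uint8_t deinterleave_idx[16] = {
    0, 4, 8, 12,   1, 5, 9, 13,   2, 6, 10, 14,   3, 7, 11, 15
};

/* Scalar model of a single byte shuffle: dst[i] = src[idx[i]]. */
void permute_bytes(const uint8_t src[16], const uint8_t idx[16],
                   uint8_t dst[16])
{
    for (int i = 0; i < 16; i++) {
        assert(idx[i] < 16);
        dst[i] = src[idx[i]];
    }
}

/* One shuffle, then four contiguous 4-byte extracts that are
   widened to int and added lane by lane. */
void sum_by_shuffle(const uint8_t a[16], int32_t out[4])
{
    uint8_t shuffled[16];
    permute_bytes(a, deinterleave_idx, shuffled);
    for (int lane = 0; lane < 4; lane++) {
        out[lane] = 0;
        for (int group = 0; group < 4; group++)
            out[lane] += shuffled[4 * group + lane];
    }
}

/* Returns 1 when both strategies agree on the given input. */
int results_match(const uint8_t a[16])
{
    int32_t ref[4], shuf[4];
    sum_by_extracts(a, ref);
    sum_by_shuffle(a, shuf);
    return memcmp(ref, shuf, sizeof ref) == 0;
}
```

In the shuffled vector each of _63, _29, _10 and _5 is a contiguous 32-bit chunk, so the sixteen byte extracts collapse into one permute plus four sub-vector extracts.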

clang does:
        pshufb  .LCPI1_0(%rip), %xmm2           # xmm2 = xmm2[0],zero,zero,zero,xmm2[1],zero,zero,zero,xmm2[3],zero,zero,zero,xmm2[2],zero,zero,zero
        pshufd  $238, %xmm2, %xmm0
