https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117093

--- Comment #3 from ktkachov at gcc dot gnu.org ---
I think it's the VIEW_CONVERT_EXPRs that are hurting us (a more complete dump
from before expand):
  _1 = VIEW_CONVERT_EXPR<uint32x4_t>(r_3(D));
  t_4 = BIT_FIELD_REF <r_3(D), 32, 0>;
  a_5 = VEC_PERM_EXPR <_1, _1, { 1, 1, 2, 3 }>;
  a_6 = BIT_INSERT_EXPR <a_5, t_4, 32 (32 bits)>;
  t_7 = BIT_FIELD_REF <r_3(D), 32, 64>;
  _2 = BIT_FIELD_REF <r_3(D), 32, 96>;
  a_8 = BIT_INSERT_EXPR <a_6, _2, 64 (32 bits)>;
  a_9 = BIT_INSERT_EXPR <a_8, t_7, 96 (32 bits)>;
  _10 = VIEW_CONVERT_EXPR<uint64x2_t>(a_9);
  return _10;
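
For reference, the dump above can be traced by hand with a scalar model. This is only a sketch: `trace_dump` is a hypothetical helper, not GCC code, and it assumes each 32-bit lane sits at bit offset 32 * index (little-endian lane order). Under that assumption, the permute plus the three BIT_INSERT_EXPRs produce lanes { r[1], r[0], r[3], r[2] }.

```c
#include <stdint.h>

/* Scalar model of the GIMPLE statements in the dump (hypothetical helper).
   Assumption: lane index = bit offset / 32. */
void trace_dump(const uint32_t r[4], uint32_t a[4]) {
    uint32_t t4 = r[0];              /* t_4 = BIT_FIELD_REF <r_3(D), 32, 0>  */

    /* a_5 = VEC_PERM_EXPR <_1, _1, { 1, 1, 2, 3 }> */
    a[0] = r[1];
    a[1] = r[1];
    a[2] = r[2];
    a[3] = r[3];

    a[1] = t4;                       /* a_6 = BIT_INSERT_EXPR <a_5, t_4, 32> */

    uint32_t t7 = r[2];              /* t_7 = BIT_FIELD_REF <r_3(D), 32, 64> */
    uint32_t v2 = r[3];              /* _2  = BIT_FIELD_REF <r_3(D), 32, 96> */

    a[2] = v2;                       /* a_8 = BIT_INSERT_EXPR <a_6, _2, 64>  */
    a[3] = t7;                       /* a_9 = BIT_INSERT_EXPR <a_8, t_7, 96> */
}
```

So the sequence is lane-for-lane a single permute with selector { 1, 0, 3, 2 }, it just isn't folded into one here.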

If we remove the casts:
uint32x4_t ror32_neon_tgt_gcc_bad(uint32x4_t r) {
    uint32x4_t a = r;
    uint32_t t;
    t = a[0]; a[0] = a[1]; a[1] = t;
    t = a[2]; a[2] = a[3]; a[3] = t;
    return a;
}
Then this is successfully recognised as:
  a_2 = VEC_PERM_EXPR <r_1(D), r_1(D), { 1, 0, 3, 2 }>;
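
A quick way to sanity-check that selector is a scalar comparison on plain arrays. The sketch below is illustrative only: `swap_pairs`, `apply_perm` and `swaps_equal_perm` are hypothetical names, and `apply_perm` models just the case where both VEC_PERM_EXPR inputs are the same vector, so the selector index is reduced modulo 4.

```c
#include <stdint.h>
#include <string.h>

/* The pairwise swaps from the cast-free function, on a plain 4-lane array. */
void swap_pairs(uint32_t a[4]) {
    uint32_t t;
    t = a[0]; a[0] = a[1]; a[1] = t;
    t = a[2]; a[2] = a[3]; a[3] = t;
}

/* Scalar model of VEC_PERM_EXPR <v, v, sel> with identical inputs:
   out[i] = in[sel[i] mod 4]. */
void apply_perm(const uint32_t in[4], const unsigned sel[4], uint32_t out[4]) {
    for (int i = 0; i < 4; i++)
        out[i] = in[sel[i] & 3];
}

/* Returns 1 if the swaps give the same lanes as selector { 1, 0, 3, 2 }. */
int swaps_equal_perm(const uint32_t r[4]) {
    static const unsigned sel[4] = { 1, 0, 3, 2 };
    uint32_t swapped[4], permuted[4];

    memcpy(swapped, r, sizeof swapped);
    swap_pairs(swapped);
    apply_perm(r, sel, permuted);

    return memcmp(swapped, permuted, sizeof swapped) == 0;
}
```

Calling `swaps_equal_perm` on any four-element input should return 1, matching the permute that GCC recognises once the casts are gone.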
