https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70053

--- Comment #6 from luoxhu at gcc dot gnu.org ---
"-O2 -ftree-slp-vectorize" could also generate the expected simple fmrs.

The reason is that pass_cselim transforms conditional stores into unconditional
ones fed by PHI nodes when vectorization and if-conversion are enabled
(gcc/tree-ssa-phiopt.c:2482).

pr70053.c.108t.cdce:
D256_add_finite (_Decimal128 a, _Decimal128 b, _Decimal128 c)
{
  struct TDx2_t D.2914;

  <bb 2> [local count: 1073741824]:
  if (b_4(D) == c_5(D))
    goto <bb 3>; [34.00%]
  else
    goto <bb 4>; [66.00%]

  <bb 3> [local count: 365072224]:
  D.2914.td0 = c_5(D);
  D.2914.td1 = c_5(D);
  goto <bb 5>; [100.00%]

  <bb 4> [local count: 708669601]:
  D.2914.td0 = a_3(D);
  D.2914.td1 = b_4(D);

  <bb 5> [local count: 1073741824]:
  return D.2914;

}

=> pr70053.c.109t.cselim:

D256_add_finite (_Decimal128 a, _Decimal128 b, _Decimal128 c)
{
  struct TDx2_t D.2914;
  _Decimal128 cstore_10;
  _Decimal128 cstore_11;

  <bb 2> [local count: 1073741824]:
  if (b_4(D) == c_5(D))
    goto <bb 4>; [34.00%]
  else
    goto <bb 3>; [66.00%]

  <bb 3> [local count: 708669601]:

  <bb 4> [local count: 1073741824]:
  # cstore_10 = PHI <c_5(D)(2), a_3(D)(3)>
  # cstore_11 = PHI <c_5(D)(2), b_4(D)(3)>
  D.2914.td1 = cstore_11;
  D.2914.td0 = cstore_10;
  return D.2914;

}

Then, in the expand pass, the PHI "cstore_10 = PHI <c_5(D)(2), a_3(D)(3)>" is
expanded to a register move with "-O2 -ftree-slp-vectorize". If no such PHI is
generated, bb3 and bb4 in pr70053.c.108t.cdce are expanded to stores and loads
with TD->DI conversion, which leaves a lot of store/load conversion code in the
final output.
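
For illustration only, the effect of the cselim transformation can be sketched
at the source level. This is not the actual pr70053.c test case: pair_t,
select_branchy, select_merged and check_equivalent are hypothetical names, and
double stands in for _Decimal128 so the sketch builds on targets without
decimal floating-point support. The first function mirrors the cdce dump
(stores in both arms), the second mirrors the cselim dump (values selected
first, then stored unconditionally).

```c
#include <assert.h>

/* Hypothetical stand-in for struct TDx2_t; double replaces _Decimal128
   (an assumption made so this compiles without DFP support). */
typedef struct { double td0; double td1; } pair_t;

/* Shape of the cdce dump: each arm of the branch stores to the result. */
static pair_t select_branchy(double a, double b, double c)
{
    pair_t r;
    if (b == c) {
        r.td0 = c;
        r.td1 = c;
    } else {
        r.td0 = a;
        r.td1 = b;
    }
    return r;
}

/* Shape of the cselim dump: the branch only selects values (the role of
   cstore_10 and cstore_11), and the stores happen once, unconditionally. */
static pair_t select_merged(double a, double b, double c)
{
    double cstore_10 = (b == c) ? c : a;
    double cstore_11 = (b == c) ? c : b;
    pair_t r;
    r.td1 = cstore_11;
    r.td0 = cstore_10;
    return r;
}

/* Both shapes must return the same pair for equal and unequal b, c. */
static void check_equivalent(void)
{
    pair_t x = select_branchy(1.0, 2.0, 2.0);
    pair_t y = select_merged(1.0, 2.0, 2.0);
    assert(x.td0 == y.td0 && x.td1 == y.td1);

    x = select_branchy(1.0, 2.0, 3.0);
    y = select_merged(1.0, 2.0, 3.0);
    assert(x.td0 == y.td0 && x.td1 == y.td1);
}
```

The two functions are semantically identical; the difference discussed above is
only in how each shape is expanded to RTL.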
