https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71437

--- Comment #6 from amker at gcc dot gnu.org ---
Hmm, the input to IVOPT differs with and without the change.
Before r235817, it looks like:

  <bb 8>:
  # i_153 = PHI <0(7), i_19(12)>
  i.1_13 = (sizetype) i_153;
  _14 = i.1_13 + 1;
  _15 = _14 * 4;
  _16 = pretmp_509 + _15;
  _17 = *_16;
  if (_17 > pretmp_506)
    goto <bb 9>;
  else
    goto <bb 11>;

  <bb 9>:
  _20 = i.1_13 * 4;
  _21 = pretmp_509 + _20;
  _22 = *_21;
  if (_22 <= pretmp_506)
    goto <bb 10>;
  else
    goto <bb 11>;

  <bb 10>:
  # _590 = PHI <_17(9)>
  # _588 = PHI <_22(9)>
  # i_583 = PHI <i_153(9)>
  _646 = i_583 + 1;
  goto <bb 17>;

  <bb 11>:
  i_19 = i_153 + 1;
  if (i_19 < _79)
    goto <bb 12>;
  else
    goto <bb 13>;

  <bb 12>:
  goto <bb 8>;

After the change, it becomes:
  <bb 8>:
  # i_136 = PHI <0(7), i_123(12)>
  i.1_2 = (sizetype) i_136;
  _3 = i.1_2 + 1;
  _4 = _3 * 4;
  _5 = pretmp_291 + _4;
  _6 = *_5;
  if (_6 > pretmp_289)
    goto <bb 9>;
  else
    goto <bb 11>;

  <bb 9>:
  _8 = i.1_2 * 4;
  _9 = pretmp_291 + _8;
  _10 = *_9;
  if (_10 <= pretmp_289)
    goto <bb 10>;
  else
    goto <bb 11>;

  <bb 10>:
  # _399 = PHI <_5(9)>
  # _398 = PHI <_9(9)>
  # i_393 = PHI <i_136(9)>
  _318 = i_393 + 1;
  goto <bb 14>;

  <bb 11>:
  i_123 = i_136 + 1;
  if (i_123 < _144)
    goto <bb 12>;
  else
    goto <bb 13>;

  <bb 12>:
  goto <bb 8>;

The major difference is in the loop-closed PHIs _590/_399 and _588/_398.  Before
the change, it is the result of the load that is PREed (_17 and _22); after the
change, it is the address of the load that is PREed (_5 and _9).  So we need to
load again outside the loop, rather than reuse the load from the last iteration
of the loop.  Maybe this is the reason for the regression.  PRE missed this in
the first place, but again, that might be caused by different input to PRE.  The
first difference comes from the vrp/dce passes.
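
To make the two shapes concrete, here is a rough C analogy of the dumps above.
This is only a sketch: the function names, the array `a`, the bound `n`, the
threshold `t`, and the use of the sum after the loop are all hypothetical and
are not taken from the test case in the PR.  The first function carries the
loaded values out of the loop (like _590/_588); the second carries the
addresses out and loads again at the exit (like _399/_398).

```c
#include <stddef.h>

/* Hypothetical shape before the change: the values loaded inside the
   loop are what is live at the loop exit, so the exit path reuses them.
   The array must hold at least n + 1 elements.  */
int sum_at_crossing_values(const int *a, int n, int t)
{
    for (int i = 0; i < n; i++) {
        int hi = a[i + 1];            /* corresponds to _17 = *_16 */
        if (hi > t) {
            int lo = a[i];            /* corresponds to _22 = *_21 */
            if (lo <= t)
                return hi + lo;       /* reuse the loaded values */
        }
    }
    return 0;
}

/* Hypothetical shape after the change: the addresses are what is live
   at the loop exit, so the exit path has to load through them again.  */
int sum_at_crossing_addrs(const int *a, int n, int t)
{
    for (int i = 0; i < n; i++) {
        const int *p_hi = a + (size_t)i + 1;   /* corresponds to _5 */
        if (*p_hi > t) {
            const int *p_lo = a + (size_t)i;   /* corresponds to _9 */
            if (*p_lo <= t)
                return *p_hi + *p_lo; /* load again outside the loop body */
        }
    }
    return 0;
}
```

Both functions compute the same result; the difference is only in what has to
stay live across the loop exit.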

As for the different IVOPT decisions, IVOPT is doing the right thing with the
current input.  With a use of the memory address outside the loop, it can't use
the [base + index << step + offset] addressing mode, because that would mean
computing the address from scratch again after the loop.
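
As a source-level analogy of that trade-off (again only a sketch with
hypothetical names; IVOPT works on GIMPLE, not on C source), compare an indexed
loop, where the address needed after the loop has to be rebuilt from the base
and the index, with a loop that keeps the address itself as the induction
variable, so it is already available at the exit:

```c
#include <stddef.h>

/* Indexed form: accesses inside the loop fit a base + scaled-index
   addressing mode, but the address returned at the exit must be
   recomputed from base and i.  Needs at least n + 1 elements.  */
const int *find_indexed(const int *base, int n, int t)
{
    for (int i = 0; i < n; i++)
        if (base[i + 1] > t && base[i] <= t)
            return base + i + 1;      /* address rebuilt from scratch */
    return NULL;
}

/* Pointer form: the address is the induction variable, so the value
   needed after the loop is a small offset from a live pointer.  */
const int *find_pointer_iv(const int *base, int n, int t)
{
    const int *end = base + n;
    for (const int *p = base; p < end; p++)
        if (p[1] > t && p[0] <= t)
            return p + 1;             /* address already at hand */
    return NULL;
}
```

Which form is cheaper overall depends on the target and on how hot the exit
path is; the sketch only shows why a use of the address outside the loop
changes the cost of the indexed choice.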

Thanks.
