https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71437
--- Comment #6 from amker at gcc dot gnu.org ---
Hmm, the input to IVOPT differs before and after the change.

Before r235817, it's like:

  <bb 8>:
  # i_153 = PHI <0(7), i_19(12)>
  i.1_13 = (sizetype) i_153;
  _14 = i.1_13 + 1;
  _15 = _14 * 4;
  _16 = pretmp_509 + _15;
  _17 = *_16;
  if (_17 > pretmp_506)
    goto <bb 9>;
  else
    goto <bb 11>;

  <bb 9>:
  _20 = i.1_13 * 4;
  _21 = pretmp_509 + _20;
  _22 = *_21;
  if (_22 <= pretmp_506)
    goto <bb 10>;
  else
    goto <bb 11>;

  <bb 10>:
  # _590 = PHI <_17(9)>
  # _588 = PHI <_22(9)>
  # i_583 = PHI <i_153(9)>
  _646 = i_583 + 1;
  goto <bb 17>;

  <bb 11>:
  i_19 = i_153 + 1;
  if (i_19 < _79)
    goto <bb 12>;
  else
    goto <bb 13>;

  <bb 12>:
  goto <bb 8>;

After the change, it becomes:

  <bb 8>:
  # i_136 = PHI <0(7), i_123(12)>
  i.1_2 = (sizetype) i_136;
  _3 = i.1_2 + 1;
  _4 = _3 * 4;
  _5 = pretmp_291 + _4;
  _6 = *_5;
  if (_6 > pretmp_289)
    goto <bb 9>;
  else
    goto <bb 11>;

  <bb 9>:
  _8 = i.1_2 * 4;
  _9 = pretmp_291 + _8;
  _10 = *_9;
  if (_10 <= pretmp_289)
    goto <bb 10>;
  else
    goto <bb 11>;

  <bb 10>:
  # _399 = PHI <_5(9)>
  # _398 = PHI <_9(9)>
  # i_393 = PHI <i_136(9)>
  _318 = i_393 + 1;
  goto <bb 14>;

  <bb 11>:
  i_123 = i_136 + 1;
  if (i_123 < _144)
    goto <bb 12>;
  else
    goto <bb 13>;

  <bb 12>:
  goto <bb 8>;

The major difference is in the loop-closed PHIs _590/_399 and _588/_398. Before the change, it is the result of the load that is PREed (_17 and _22); after the change, it is the address of the load that is PREed (_5 and _9). So we need to load again outside of the loop, rather than reuse the load from the last iteration of the loop. Maybe this is the reason for the regression. PRE missed this in the first place, but again, it might be caused by different input to PRE. The first difference comes from the vrp/dce pass.

As for the different IVOPT decisions, it's doing the right thing with the current input. With an outside-loop use of the memory address, it can't use the [base + index << step + offset] addressing mode, because that would mean we need to compute the address from scratch again after the loop.

Thanks.
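
For illustration only, here is a rough C-level sketch of the two shapes of IVOPT input discussed in this comment. It is not the testcase from this PR: the function names (find_reusing_values, find_reusing_addresses) and the out-parameters lo and hi are hypothetical, and the array is assumed to hold at least n + 1 elements. The first function carries the loaded values out of the loop (like PHI <_17(9)> and PHI <_22(9)>); the second carries the addresses out and loads again after the loop (like PHI <_5(9)> and PHI <_9(9)>).

```c
#include <stddef.h>

/* Hypothetical reduction: the loaded VALUES are live after the loop,
   so no address has to survive the loop exit.  */
int find_reusing_values(const int *a, int n, int t, int *lo, int *hi)
{
    for (int i = 0; i < n; i++) {
        int hi_val = a[(size_t)i + 1];      /* like _17 = *_16 */
        if (hi_val > t) {
            int lo_val = a[(size_t)i];      /* like _22 = *_21 */
            if (lo_val <= t) {
                *hi = hi_val;               /* reuse last iteration's load */
                *lo = lo_val;
                return i + 1;
            }
        }
    }
    return -1;
}

/* Hypothetical reduction: the ADDRESSES are live after the loop,
   so both locations are loaded a second time outside the loop.  */
int find_reusing_addresses(const int *a, int n, int t, int *lo, int *hi)
{
    for (int i = 0; i < n; i++) {
        const int *hi_p = a + (size_t)i + 1;    /* like _5 */
        if (*hi_p > t) {
            const int *lo_p = a + (size_t)i;    /* like _9 */
            if (*lo_p <= t) {
                *hi = *hi_p;                    /* second load after exit */
                *lo = *lo_p;
                return i + 1;
            }
        }
    }
    return -1;
}
```

Both functions return the same results; the sketch only shows which quantity is live across the loop exit. In the second form the address itself must exist as a value after the loop, which matches the point above about why IVOPT can no longer fold it into the addressing mode.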