http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39838
--- Comment #9 from Zdenek Dvorak <rakdver at gcc dot gnu.org> 2011-01-17 11:04:22 UTC ---
> UGH. Everything involving ivtmp.12 is a waste of time. We really just need to
> realize that D1976_16 is D1971_10 + 4 which avoids all the nonsense with
> ivtmp.12 and I *think* would restore the quality of this code. I don't know
> enough about the current ivopts code to prototype this and verify that such a
> change would restore the quality of this code.

Actually, ivopts knows that D1976_16 is D1971_10 + 4; however, since both
D1971_10 = ivtmp.12 - 4 and D1971_10 = i << 2 have the same complexity, it
arbitrarily decides to use the latter.

The problem is in ivopts deciding to create ivtmp.12 at all. However, this will
be somewhat hard to avoid, since locally, replacing D1976_16 (= (i << 2) + 4) by
a new iv is a correct decision (it replaces an addition and a shift by a single
addition).

Ideally, ivopts should recognize that D1976_16 and D1971_10 are used for memory
addressing and use the appropriate addressing modes ([D.1969_8 + 4*i + 4]
instead of [D.1972_11]). However, that also fails, since ivopts only recognizes
memory references whose addresses are affine ivs (which is not the case here, as
we cannot prove that D.1969_8 is invariant in the loop).
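
To make the two shapes concrete, here is a rough source-level sketch. It is not the testcase from this PR and not actual ivopts output; the function names, the int32_t element type, and the loop body are assumptions chosen only so that the byte offsets match the expressions discussed above. The first function mirrors the result described in this comment (an extra iv for the "+ 4" offset, plus a shift for the other offset), the second derives one offset from the other.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical loop: sum base[i] + base[i + 1] using explicit byte offsets.
   Mirrors the outcome described above: a separate iv (ivtmp) tracks the
   "+ 4" offset, while the other offset is still recomputed as i << 2. */
long sum_with_extra_iv(const int32_t *base, size_t n)
{
    const char *p = (const char *)base;
    long sum = 0;
    size_t ivtmp = 4;                       /* plays the role of ivtmp.12 */
    for (size_t i = 0; i + 1 < n; i++, ivtmp += 4) {
        size_t d1971 = i << 2;              /* byte offset of base[i] */
        size_t d1976 = ivtmp;               /* byte offset of base[i + 1] */
        sum += *(const int32_t *)(p + d1971);
        sum += *(const int32_t *)(p + d1976);
    }
    return sum;
}

/* Alternative asked for in the quoted text: compute one offset and derive
   the other as "offset + 4", so no second iv is needed. */
long sum_with_shared_offset(const int32_t *base, size_t n)
{
    const char *p = (const char *)base;
    long sum = 0;
    for (size_t i = 0; i + 1 < n; i++) {
        size_t d1971 = i << 2;              /* byte offset of base[i] */
        size_t d1976 = d1971 + 4;           /* byte offset of base[i + 1] */
        sum += *(const int32_t *)(p + d1971);
        sum += *(const int32_t *)(p + d1976);
    }
    return sum;
}
```

Both functions compute the same value; the difference is only in how many values have to be kept live and updated across iterations, which is the cost the quoted text complains about.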