[Bug middle-end/39838] [4.3/4.4/4.5/4.6 regression] unoptimal code for two simple loops

rakdver at gcc dot gnu.org Mon, 17 Jan 2011 03:04:36 -0800

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39838


--- Comment #9 from Zdenek Dvorak <rakdver at gcc dot gnu.org> 2011-01-17 11:04:22 UTC ---
> UGH.  Everything involving ivtmp.12 is a waste of time.  We really just need to
> realize that D1976_16 is D1971_10 + 4 which avoids all the nonsense with
> ivtmp.12 and I *think* would restore the quality of this code.   I don't know
> enough about the current ivopts code to prototype this and verify that such a
> change would restore the quality of this code.

Actually, ivopts knows that D1976_16 is D1971_10 + 4; however, since both

D1971_10 = ivtmp.12 - 4;
and
D1971_10 = i << 2;

have the same complexity, it arbitrarily decides to use the latter.  The
problem is that ivopts decides to create ivtmp.12 at all.  However, this will be
somewhat hard to avoid, since locally, replacing D1976_16 (= (i << 2) + 4) by a
new iv is a correct decision (it replaces an addition and a shift by a single
addition).
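
To make the local trade-off concrete, here is a rough C sketch of the two
shapes.  It is not the test case from the PR; the function names are made up
for illustration, and the variables d10, d16 and ivtmp merely stand in for
D1971_10, D1976_16 and ivtmp.12.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical illustration: both offsets derived from i directly.
   d16 costs a shift plus an addition in every iteration. */
size_t sum_offsets_shift(size_t n)
{
    size_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        size_t d10 = i << 2;        /* stands in for D1971_10 */
        size_t d16 = (i << 2) + 4;  /* stands in for D1976_16 */
        sum += d10 + d16;
    }
    return sum;
}

/* Hypothetical illustration: d16 replaced by a new induction variable
   that starts at 4 and is bumped by 4, so it costs one addition per
   iteration.  d10 can then be written either as i << 2 or as
   ivtmp - 4; both are a single operation. */
size_t sum_offsets_ivtmp(size_t n)
{
    size_t sum = 0;
    size_t ivtmp = 4;               /* stands in for ivtmp.12 */
    for (size_t i = 0; i < n; i++, ivtmp += 4) {
        size_t d16 = ivtmp;
        size_t d10 = i << 2;
        assert(d10 == ivtmp - 4);   /* the two forms of d10 agree */
        sum += d10 + d16;
    }
    return sum;
}
```

Seen in isolation, the second form is cheaper for d16, which is why the
choice looks right locally even though it adds an extra iv to the loop.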

Ideally, ivopts should recognize that D1976_16 and D1971_10 are used for memory
addressing and use the appropriate addressing modes ([D.1969_8 + 4*i + 4]
instead of [D.1972_11]).  However, that also fails, since ivopts only recognizes
memory references whose addresses are affine ivs (which is not the case
here, as we cannot prove that D.1969_8 is invariant in the loop).
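
As a rough illustration of the invariance issue (again not the PR's test
case; struct buf and both function names are hypothetical), compare a loop
that reloads its base pointer with one where the base is hoisted by hand.
Whether a given compiler version can prove the reloaded base invariant
depends on its alias analysis, so this only sketches the shape of the
problem.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical container; the base pointer lives in memory. */
struct buf { int *data; };

/* The base b->data is formally reloaded in every iteration.  If the
   compiler cannot prove that the load is loop-invariant, the addresses
   are not affine in i from its point of view. */
void shift_reload(struct buf *b, size_t n)
{
    assert(b != NULL && b->data != NULL);
    for (size_t i = 0; i + 1 < n; i++)
        b->data[i] = b->data[i + 1];
}

/* With the base copied into a local before the loop, the two addresses
   are base + 4*i and base + 4*i + 4 (assuming a 4-byte int), i.e.
   affine in i and a natural fit for a base + scaled-index +
   displacement addressing mode. */
void shift_hoisted(struct buf *b, size_t n)
{
    assert(b != NULL && b->data != NULL);
    int *base = b->data;
    for (size_t i = 0; i + 1 < n; i++)
        base[i] = base[i + 1];
}
```

Both functions compute the same result; the difference is only in how much
the optimizer has to prove before it can treat the addresses as affine ivs.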

[Bug middle-end/39838] [4.3/4.4/4.5/4.6 regression] unoptimal code for two simple loops
