http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48732
--- Comment #2 from Richard Guenther <rguenth at gcc dot gnu.org> 2011-04-26 12:13:50 UTC ---
With -O1 we do not perform expensive control-dependent DCE and thus do not end up removing the empty loops. We do, however, remove the dead store in the innermost loop, which then causes us to compute that completely unrolling all loops is profitable (we basically see that it's all dead code that will be produced).

Now, unfortunately, before removing all that dead code we perform re-association on the induction variable increment chains, of which there are a lot (8 ^ n of them and more, actually).

We've known for quite some time that not doing constant propagation and dead code elimination on the unrolled loop bodies isn't the best idea (induction variable analysis is also pessimized by not doing CSE on those). The only CCP-like passes after loop opts are VRP, which does not run below -O2, and DOM, and both run after re-assoc (though I don't see a particularly good reason for that).

Scheduling DOM right after loop opts "fixes" this. But a more proper fix would be to do cleanups closer to unrolling.
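
For illustration only, the kind of input described above might look like the sketch below. It is not the testcase attached to this PR: the names `nest`, `unrolled_copies`, and `TRIP`, the nesting depth of four, and the trip count of 8 (inferred from the "8 ^ n" remark) are all assumptions. Each loop has a constant trip count and the only statement in the innermost body is a store that is never read, so once that store is removed the nest is empty and complete unrolling looks free.

```c
/* Hypothetical sketch of the loop shape discussed in the comment;
   not the reproducer from the bug report. */

#define TRIP 8  /* assumed constant trip count per loop level */

/* Number of innermost-body copies that complete unrolling of a nest
   of the given depth would produce: TRIP raised to the power depth. */
unsigned long unrolled_copies(unsigned depth)
{
    unsigned long copies = 1;
    for (unsigned i = 0; i < depth; ++i)
        copies *= TRIP;
    return copies;
}

/* A four-deep nest whose only work is a dead store to a local.
   Removing the store leaves empty loops behind; removing the loops
   themselves needs control-dependent DCE. */
void nest(void)
{
    int sink;
    for (int a = 0; a < TRIP; ++a)
        for (int b = 0; b < TRIP; ++b)
            for (int c = 0; c < TRIP; ++c)
                for (int d = 0; d < TRIP; ++d)
                    sink = a + b + c + d;  /* never read: dead store */
}
```

Under these assumptions, every additional nesting level multiplies the number of unrolled copies, and with it the number of induction variable increments left for re-association, by 8.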