http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55731
--- Comment #4 from Yuri Rumyantsev <ysrumyan at gmail dot com> 2012-12-19 09:17:40 UTC ---
(In reply to comment #3)
> The reason is that unrolling early can be harmful to for example vectorization
> and thus cunrolli restricts itself to "obviously" profitable cases.
>
> In this case the loop is not an "inner" loop - it doesn't have a containing
> loop and so growth is not allowed even with -O3 (we otherwise will fail
> to vectorize if the unrolled body ends up as part of other basic-blocks).
>

Richard,

It looks like you did not see the attached testcases. I can't agree with your
statement, since:

1. The loop in question (t.c) has only 3 iterations, so in any case it should
   not be considered a candidate for vectorization.
2. The loop contains calls to functions that do not have vectorizable
   counterparts.
3. The loop contains comparisons against the loop control variable, such as
   if (i == 0), and the cunrolli phase detects this:

   BB: 7, after_exit: 1
     size:   2 if (i_1 == 1)
      Constant conditional.
   BB: 5, after_exit: 1
     size:   2 foo4 (k_15(D));
     size:   2 if (i_1 == 0)
      Constant conditional.

This means that these tests will be completely eliminated by the loop unroller
and some basic blocks will become unreachable.

I also added another testcase (t2.c), for which cunrolli estimates the size
correctly and completely unrolls the loop (it has only 2 iterations).

So I assume that the size estimation algorithm in the unroller is not perfect
and must be rewritten.

Finally, if the user passes "-funroll-loops" to gcc, we should not treat
possible size growth as a reason to reject unrolling.

> It's a know issue that after cunroll there is no strong value-numbering
> pass that handles memory (there is DOM which only has weak memory handling).
>
> So, it's a trade-off we make, mostly for the sake of loop optimizations
> that do not handle unrolled loops well.

Best regards.
Yuri.
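
For illustration only: the sketch below is a hypothetical reduction of the kind of loop discussed in points 1-3, not the attached t.c. All names in it (trace, foo_a, foo_b, foo_c, bar_loop, bar_unrolled, same_effect) are invented for the example. It shows a 3-iteration loop whose body consists of tests on the loop counter and opaque calls, next to what complete unrolling plus folding of the constant conditionals should leave behind.

```c
#include <assert.h>

/* Hypothetical reduction for illustration; this is NOT the attached t.c.
   Every identifier below is an assumption made for the example. */

static int trace[3];   /* records the side effect of each helper */

/* noinline (a GCC attribute) keeps the calls opaque, standing in for
   functions that have no vectorizable counterpart. */
__attribute__((noinline)) static void foo_a(int k) { trace[0] += k; }
__attribute__((noinline)) static void foo_b(int k) { trace[1] += 2 * k; }
__attribute__((noinline)) static void foo_c(int k) { trace[2] += 3 * k; }

/* Three iterations; each test on i is a constant within a given iteration. */
static void bar_loop(int k)
{
  int i;
  for (i = 0; i < 3; i++)
    {
      if (i == 0)
        foo_a(k);
      if (i == 1)
        foo_b(k);
      if (i == 2)
        foo_c(k);
    }
}

/* What remains once the loop is completely unrolled and the
   constant conditionals are folded away. */
static void bar_unrolled(int k)
{
  foo_a(k);
  foo_b(k);
  foo_c(k);
}

/* Returns 1 when both versions leave the same trace, 0 otherwise. */
int same_effect(int k)
{
  int expect[3];
  int i;

  for (i = 0; i < 3; i++)
    trace[i] = 0;
  bar_unrolled(k);
  for (i = 0; i < 3; i++)
    {
      expect[i] = trace[i];
      trace[i] = 0;
    }
  bar_loop(k);
  for (i = 0; i < 3; i++)
    if (trace[i] != expect[i])
      return 0;
  return 1;
}

/* Aborts if the two versions ever disagree for the given k. */
void check_same_effect(int k)
{
  assert(same_effect(k));
}
```

The unrolled form has three calls and no branches, so for a loop of this shape the unrolled body is not expected to be larger than the original. Compiling such a file with -O3 -fdump-tree-cunrolli-details should produce a size estimate similar to the dump quoted in point 3, though the exact output depends on the GCC version.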