On Mon, 16 Feb 2015, Richard Biener wrote:

> 
> Predictive commoning happens to re-use SSA names it released while
> there are still uses of them (oops), confusing the hell out of
> other code (expected).  Fixed thus.
> 
> Bootstrap and regtest running on x86_64-unknown-linux-gnu.

So I was wrong in that this doesn't fix PR65063, but it pointed at a
similar issue.  The loop transform code doesn't handle the case
where we replace looparound PHIs and need an epilogue loop (thus
we use unrolling).  The following patch disables unrolling in that case.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2015-02-17  Richard Biener  <rguent...@suse.de>

        PR tree-optimization/65063
        * tree-predcom.c (determine_unroll_factor): Return 1 if we
        have replaced looparound PHIs.

        * gcc.dg/pr65063.c: New testcase.

Index: gcc/tree-predcom.c
===================================================================
*** gcc/tree-predcom.c  (revision 220755)
--- gcc/tree-predcom.c  (working copy)
*************** determine_unroll_factor (vec<chain_p> ch
*** 1775,1783 ****
  
    FOR_EACH_VEC_ELT (chains, i, chain)
      {
!       if (chain->type == CT_INVARIANT || chain->combined)
          continue;
  
        /* The best unroll factor for this chain is equal to the number of
           temporary variables that we create for it.  */
        af = chain->length;
--- 1775,1794 ----
  
    FOR_EACH_VEC_ELT (chains, i, chain)
      {
!       if (chain->type == CT_INVARIANT)
          continue;
  
+       if (chain->combined)
+         {
+           /* For combined chains, we can't handle unrolling if we replace
+              looparound PHIs.  */
+           dref a;
+           unsigned j;
+           for (j = 1; chain->refs.iterate (j, &a); j++)
+             if (gimple_code (a->stmt) == GIMPLE_PHI)
+               return 1;
+         }
+ 
        /* The best unroll factor for this chain is equal to the number of
           temporary variables that we create for it.  */
        af = chain->length;
Index: gcc/testsuite/gcc.dg/pr65063.c
===================================================================
*** gcc/testsuite/gcc.dg/pr65063.c      (revision 0)
--- gcc/testsuite/gcc.dg/pr65063.c      (working copy)
***************
*** 0 ****
--- 1,33 ----
+ /* { dg-do run } */
+ /* { dg-options "-O3 -fno-tree-loop-ivcanon -fno-tree-vectorize" } */
+ 
+ static int in[8][4];
+ static int out[4];
+ static const int check_result[] = {0, 16, 256, 4096};
+ 
+ static inline void foo ()
+ {
+   int sum;
+   int i, j, k;
+   for (k = 0; k < 4; k++)
+     {
+       sum = 1;
+       for (j = 0; j < 4; j++)
+         for (i = 0; i < 4; i++)
+           sum *= in[i + k][j];
+       out[k] = sum;
+     }
+ }
+ 
+ int main ()
+ {
+   int i, j, k;
+   for (i = 0; i < 8; i++)
+     for (j = 0; j < 4; j++)
+       in[i][j] = (i + 2) / 3;
+   foo ();
+   for (k = 0; k < 4; k++)
+     if (out[k] != check_result[k])
+       __builtin_abort ();
+   return 0;
+ }
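
For anyone checking the testcase by hand, the values in check_result can be derived without running the loop nest that predictive commoning transforms. The following is only a sketch and is not part of the patch; row_value and expected_out are hypothetical helper names introduced here for illustration. It relies on the initializer in main (), where in[i][j] = (i + 2) / 3 does not depend on j, so out[k] is the product of the four rows k .. k + 3 raised to the fourth power.

```c
/* Hypothetical helpers, not part of the patch: compute the value the
   testcase expects in out[k] without the loop nest under test.  */

/* Same initializer as main () in the testcase: in[i][j] = (i + 2) / 3,
   which does not depend on j.  */
static int
row_value (int i)
{
  return (i + 2) / 3;
}

/* out[k] is the product of in[i + k][j] over i, j in [0, 4).  Since the
   value is independent of j, that equals the product over the four rows
   k .. k + 3, raised to the fourth power.  */
int
expected_out (int k)
{
  int rows = 1;
  for (int i = 0; i < 4; i++)
    rows *= row_value (i + k);
  return rows * rows * rows * rows;
}
```

Calling expected_out (k) for k = 0 .. 3 gives 0, 16, 256 and 4096, which matches check_result in the testcase.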