void bar(void);

void foo(int ie, int je)
{
  int i, j;
  for (j=0; j<je; ++j)
    for (i=0; i<ie; ++i)
      bar();
}

should _not_ be transformed to

foo (ie, je)
{
  int j;
  int i;

<bb 0>:
  if (je > 0) goto <L23>; else goto <L5>;

<L23>:;
  j = 0;
  goto <bb 3> (<L2>);

<L22>:;
  i = 0;

<L1>:;
  bar ();
  i = i + 1;
  if (ie != i) goto <L1>; else goto <L3>;

<L3>:;
  j = j + 1;
  if (je != j) goto <L2>; else goto <L5>;

<L2>:;
  if (ie > 0) goto <L22>; else goto <L3>;

<L5>:;
  return;

}

i.e. containing a loop-invariant check if (ie > 0), which is evaluated
once per iteration of the j loop.  Both DOM and copy-header do this
transformation.  Disabling both we get

;; Function foo (foo)

foo (ie, je)
{
  int j;
  int i;

<bb 0>:
  j = 0;
  goto <bb 4> (<L4>);

<L1>:;
  bar ();
  i = i + 1;

<L2>:;
  if (i < ie) goto <L1>; else goto <L3>;

<L3>:;
  j = j + 1;

<L4>:;
  if (j < je) goto <L8>; else goto <L5>;

<L8>:;
  i = 0;
  goto <bb 2> (<L2>);

<L5>:;
  return;

}

which is a _lot_ faster for small ie.  Optimally we would hoist the
loop-invariant check out of the j loop.

-- 
           Summary: Should not do loop header copying for inner loop
           Product: gcc
           Version: 4.1.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P2
         Component: tree-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: rguenth at gcc dot gnu dot org
                CC: gcc-bugs at gcc dot gnu dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23855
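
For illustration, the hoisting asked for above ("hoist the loop invariant check out of the j loop") could look roughly like this when written back as C source. This is only a sketch of the desired shape, not what any GCC pass emits. The name foo_hoisted and the counting definition of bar() are hypothetical stand-ins added so the example is self-contained; in the testcase bar() is an external function.

```c
/* Hypothetical stand-in for the external bar() of the testcase:
   it only counts how often it is called. */
static int bar_calls = 0;

static void bar(void)
{
  ++bar_calls;
}

/* Sketch of foo() with the invariant test (ie > 0) hoisted out of
   the j loop.  foo_hoisted is an illustrative name, not from the
   report. */
void foo_hoisted(int ie, int je)
{
  int i, j;

  /* ie does not change inside the j loop, so test it once here
     instead of once per j iteration. */
  if (ie <= 0)
    return;

  for (j = 0; j < je; ++j)
    {
      /* The inner loop is now known to run at least once, so it can
         keep the copied-header (do-while) form without a guard. */
      i = 0;
      do
        {
          bar();
          ++i;
        }
      while (i < ie);
    }
}
```

The early return is valid here only because the j loop has no side effects other than the inner loop; with other work in the outer body the check would instead select between two versions of the j loop.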