https://gcc.gnu.org/bugzilla/show_bug.cgi?id=35545
--- Comment #22 from Jan Hubicka <hubicka at ucw dot cz> ---
> > Doing it at approximately the same place as loop header copying seems to
> > make most sense to me.  It definitely benefits from early cleanups and DCE,
> > and it should enable more fun with the later scalar passes that are almost
> > all rerun then.
>
> We need to make sure tracer doesn't mess too much with loops then.
> Btw, "useless" tracing may be undone again by tail-merging.

Tracer already does:

          /* We have the tendency to duplicate the loop header
             of all do { } while loops.  Do not do that - it is
             not profitable and it might create a loop with multiple
             entries or at least rotate the loop.  */
          && bb2->loop_father->header != bb2)

so it won't kill natural loops and peel (I should find time to update the
peeling pass).  It also has:

          if (bb_seen_p (bb2) || (e->flags & (EDGE_DFS_BACK | EDGE_COMPLEX))
              || find_best_successor (bb2) != e)
            break;

to not unroll.
So I think it is safe for the loop optimizer - all it can do is expand a loop
that has control flow in it, reducing its unrollability later.

> Tracer seems to consume only profile information and thus doesn't
> rely on any other transforms (well, apart from cleanups which could
> affect its cost function).  Why not schedule it even earlier?
> Like to before pass_build_alias?  (the pipeline up to loop transforms
> is quite a mess...)

It uses profile information and code size estimates.  I expect that
(especially for C++ stuff) the early post-inline passes will remove a lot of
code and thus improve traceability.  This is why I looked for a spot after the
first DCE/DSE.

David, can you please be more specific about how you tested?  Was it with
profile feedback?  What about code size metrics?

Honza