https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88440
--- Comment #8 from rguenther at suse dot de <rguenther at suse dot de> ---
On Thu, 16 May 2019, marxin at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88440
>
> --- Comment #7 from Martin Liška <marxin at gcc dot gnu.org> ---
> (In reply to Richard Biener from comment #6)
> > Created attachment 45313 [details]
> > patch
> >
> > This enables distribution of patterns at -O[2s]+ and optimizes the testcase
> > at -Os by adjusting the guards in loop distribution.
> >
> > Note that the interesting bits are compile-time, binary-size and performance
> > at mainly -O2, eventually size at -Os.
> >
> > I suspect that at -O2 w/o profiling most loops would be
> > optimize_loop_for_speed anyways so changing the heuristics isn't so bad
> > but of course enabling distribution at -O2 might incur a penalty.
>
> I have so far build numbers on a Zen machine with -j16:
...
> There's only one difference:
>
> 521.wrf_r: 310 -> 346s

Ick.  I currently see no limiting on the size of loops in loop
distribution.  One easy option would be to limit the worklist size in
find_seed_stmts_for_distribution with a --param we can lower at
-O[2s]; another would be to limit loop nest depth similarly.

A profile might be interesting here as well...
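
To make the worklist-limit idea concrete, here is a minimal standalone sketch. It is not GCC code: `Stmt`, `collect_seeds` and `max_seeds` are hypothetical names invented for illustration, and the real find_seed_stmts_for_distribution works on GCC's own statement representation. The only point is the shape of the guard: stop collecting seeds once a cap (which a --param could supply) is reached and skip the loop.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a statement in a loop body; not a GCC type.
struct Stmt {
    int id;
    bool is_store;  // in this toy model only stores act as seeds
};

// Collect seed statements into `worklist`, giving up as soon as the
// worklist would grow past `max_seeds`.  `max_seeds` plays the role of
// the proposed --param (assumed name, not an existing GCC parameter).
// Returns false when the cap is hit or no seed exists, so the caller
// can skip distribution for this loop.
bool collect_seeds(const std::vector<Stmt>& loop_body,
                   std::size_t max_seeds,
                   std::vector<int>& worklist) {
    worklist.clear();
    for (const Stmt& s : loop_body) {
        if (!s.is_store)
            continue;
        if (worklist.size() >= max_seeds) {
            // Too many seeds: treat the loop as too large to distribute.
            worklist.clear();
            return false;
        }
        worklist.push_back(s.id);
    }
    return !worklist.empty();
}
```

With a cap of 8, a body containing three stores yields a three-entry worklist; with a cap of 2 the same body is rejected and the worklist is left empty. A lower default at -O[2s] would then bound compile time on very large loops, at the cost of missing some distribution opportunities there.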