> On Sat, 28 May 2016, Jan Hubicka wrote:
>
> > Hello,
> > thanks for feedback.  I updated the patch and also noticed that
> > -fpeel-all-loops gives up when upper bound is known but it is large
> > and when the max-peel-insns is too small to permit peeling
> > max-peel-times.  This patch also updates pr61743-2.c which are now
> > peeled before we manage to propagate the proper loop bound.
> >
> > Bootstrapped/regtested x86_64-linux.  OK?
>
> Humm, so why add -fpeel-all-loops?  I don't think -funroll-all-loops
> is useful.

It is mostly there to trigger the transform to see if it is useful.
Not something you want to enable by default.
-fpeel-all-loops helps when you know your code has internal loops that
iterate only a few times.  For example, one can get a good speedup on
the sudoku solver benchmark because it has loops that iterate either
once or 10 times.
http://www.ucw.cz/~hubicka/papers/amd64/node4.html also claims that
-funroll-all-loops improves SPECint by 2.5%, while -funroll-loops by
2.23%, so it seemed somewhat useful back then.

>
> Did you check code-size/compile-time/performance effects of enabling
> -fpeel-loops at -O3 for, say, SPEC CPU 2006?

Martin Liska ran it on the SPEC2006 and v6 (not with the latest fixes to
the heuristics).  Without FDO, loop peeling triggers only for loops
where we have likely_upper_bound != upper_bound.  We do not predict that
too often (we may in the future, as there is room for improvement in
niter).  The code size effect was +0.9% for SPECint and +2.2% for
SPECfp.  The off-noise improvements were vrp 94.5->89.7 and John the
ripper 106.9->100 (less than 0.1% in geomavg).

I have cut down the code size effects since then (there was a bug that
made us peel without control when maxiter overflowed), but I have not
re-run the full SPEC suites since.

My motivation was mainly to reduce the number of optimizations that are
not good enough to be enabled by default, and also the observation that
it helps some benchmarks.

We can re-run the benchmarks with the current patch once the profile
update issues are fixed; I will send those fixes shortly.

No fine-tuning of the parameters was done and I guess they are still
set the way they were set for RTL peeling in 2003.

Honza
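
For illustration only (this is not taken from the patch or from GCC's
output): a minimal hand-written C sketch of what peeling does to a short
inner loop like the ones described above.  The function names sum_plain
and sum_peeled and the peel count of 2 are assumptions chosen for the
example.

```c
#include <stddef.h>

/* Original form: an inner loop whose trip count is usually very small. */
int sum_plain(const int *v, size_t n)
{
    int s = 0;
    for (size_t i = 0; i < n; i++)
        s += v[i];
    return s;
}

/* Hand-peeled form (assumed peel count: 2).  The first two iterations
   are copied out in front of the loop, so short trip counts execute as
   straight-line code and the loop body runs only when n > 2.  */
int sum_peeled(const int *v, size_t n)
{
    int s = 0;
    size_t i = 0;

    if (i < n)
        s += v[i++];          /* peeled iteration 1 */
    if (i < n)
        s += v[i++];          /* peeled iteration 2 */

    for (; i < n; i++)        /* remaining iterations, if any */
        s += v[i];
    return s;
}
```

Both functions return the same result for every n; the peeled one trades
code size for fewer loop-back branches when n is small, which is the
trade-off the code size and benchmark numbers above are measuring.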