Re: Enable loop peeling at -O3

Jan Hubicka Mon, 30 May 2016 04:08:28 -0700

> On Sat, 28 May 2016, Jan Hubicka wrote:
> 
> > Hello,
> > thanks for the feedback. I updated the patch and also noticed that
> > -fpeel-all-loops gives up when the upper bound is known but large, and
> > when max-peel-insns is too small to permit peeling max-peel-times. This
> > patch also updates pr61743-2.c, whose loops are now peeled before we
> > manage to propagate the proper loop bound.
> > 
> > Bootstrapped/regtested x86_64-linux. OK?
> 
> Humm, so why add -fpeel-all-loops?  I don't think -funroll-all-loops
> is useful.
It is mostly there to trigger the transform to see if it is useful. It is not
something you want to enable by default.


-fpeel-all-loops helps when you know your code has inner loops that iterate
only a few times.  E.g. one can get a good speedup for the sudoku solver
benchmark because it has loops that iterate either once or 10 times.
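
As a rough illustration of what peeling does to such a loop, here is a
hand-written sketch in C. It is not actual GCC output, the function names are
made up for this example, and the peel count of two is an arbitrary choice.
The point is only that when the trip count is usually tiny, the peeled copies
handle the common case as straight-line code and the remaining loop is rarely
entered.

```c
#include <stddef.h>

/* Original loop: n is assumed to be very small most of the time. */
int sum_plain(const int *a, size_t n)
{
    int s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Same computation with the first two iterations peeled by hand.
   Each peeled copy keeps the exit test, so any n (including 0) is
   still handled correctly; the loop only runs for n > 2. */
int sum_peeled2(const int *a, size_t n)
{
    int s = 0;
    size_t i = 0;

    if (i < n) { s += a[i]; i++; }   /* peeled iteration 1 */
    if (i < n) { s += a[i]; i++; }   /* peeled iteration 2 */

    for (; i < n; i++)               /* remaining iterations, if any */
        s += a[i];
    return s;
}
```

The cost is the duplicated loop body, which is where the code size growth
comes from.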

http://www.ucw.cz/~hubicka/papers/amd64/node4.html also claims that
-funroll-all-loops improves specint by 2.5%, while -funroll-loops by 2.23%,
so it seemed somewhat useful back then.
> 
> Did you check code-size/compile-time/performance effects of enabling
> -fpeel-loops at -O3 for, say, SPEC CPU 2006?

Martin Liska ran it on SPEC2006 and v6 (not with the latest fixes to the
heuristics).  Without FDO, loop peeling triggers only for loops where we
have likely_upper_bound != upper_bound.  We do not predict that too often (we
may in the future, as there is room for improvement in niter).  The code size
effect was +0.9% for SPECint and +2.2% for SPECfp.  The off-noise improvements
were vrp 94.5->89.7 and John the ripper 106.9->100 (less than 0.1% in
geomavg).  I have since cut down the code size effects (there was a bug that
made us peel without control when maxiter overflowed), but I did not re-run
the full SPEC afterwards.

My motivation was mainly to reduce the number of optimizations that are not
good enough to be enabled by default, and also the observation that it helps
some benchmarks.

We can re-run the benchmarks with the current patch after the fixes for the
profile update issues, which I will send shortly.  No fine-tuning of the
parameters was done, and I guess they are still set the way they were set for
RTL peeling in 2003.

Honza
