http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57290

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
I'm trying to reproduce it.  Can you on your side verify whether dropping
-ftree-loop-linear changes anything with respect to the regression?
Also, what do the numbers for

(6) -Ofast -funroll-loops -fwhole-program

look like?  If you factor in LTO, then you should compare
against a revision that includes

2013-04-26  Richard Biener  <rguent...@suse.de>

        * Makefile.in (lto-streamer-in.o): Add $(CFGLOOP_H) dependency.
        (lto-streamer-out.o): Likewise.
        * cfgloop.c (init_loops_structure): Export, add struct function
        argument and adjust.
        (flow_loops_find): Adjust.
        * cfgloop.h (enum loop_estimation): Add EST_LAST.
        (init_loops_structure): Declare.
        * lto-streamer-in.c: Include cfgloop.h.
        (input_cfg): Input the loop tree.
        * lto-streamer-out.c: Include cfgloop.h.
        (output_cfg): Output the loop tree.
        (output_struct_function_base): Do not drop PROP_loops.

I see

(1) -Ofast -funroll-loops -fomit-frame-pointer -fwhole-program -flto
(2) -Ofast -funroll-loops -fomit-frame-pointer -fwhole-program -flto
    -fprotect-parens

revision:    198332     198333
(1)          15.5+-.3   15.6+-.2
(2)          16.1+-.1   15.9+-.2

Note that the PAREN_EXPR thing is what made me point at the extra copyprop
pass.  So there is a difference between -f[no-]protect-parens, but I cannot
see a regression between the two revisions.
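
To make the "no regression" reading of the table concrete, here is a minimal
sketch that checks whether the timings for the two revisions overlap within
their stated spread.  It assumes the "+-" values are symmetric error margins
around the mean; the helper name intervals_overlap is hypothetical and not
part of any tool mentioned here.

```python
def intervals_overlap(mean_a, err_a, mean_b, err_b):
    """Return True if [mean_a - err_a, mean_a + err_a] and
    [mean_b - err_b, mean_b + err_b] share at least one point."""
    return abs(mean_a - mean_b) <= err_a + err_b


# Timings copied from the table: (mean, +- margin) for r198332 and r198333.
# Treating "+-" as a symmetric margin is an assumption.
timings = {
    "(1)": ((15.5, 0.3), (15.6, 0.2)),
    "(2)": ((16.1, 0.1), (15.9, 0.2)),
}

for name, ((m_old, e_old), (m_new, e_new)) in timings.items():
    same = intervals_overlap(m_old, e_old, m_new, e_new)
    verdict = "overlap" if same else "differ"
    print(f"{name} r198332 vs r198333: {verdict}")
```

An overlap only says the measurements are consistent with no change; it is
not a statistical test.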

Are you testing 64bit or 32bit executables?  On Intel or PPC?

Since you noted non-monotonic behavior wrt inlining decisions, it would be
interesting to know whether those differ for you with (5) between rev. 198332
and 198333.  Add
-fdump-ipa-inline to the command-line and inspect the
aermod.f90.wpa.047i.inline
dumpfile, grepping for 'Inlined into'.  I only see changes in estimated
time/size but no real code changes.  I do see code layout changes though
and changes in LTRANS due to the extra copyprop pass.
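
The comparison of the two inline dumps can be sketched as follows.  This is
only an illustration: the function names and the way the two dump files are
passed in are assumptions, and the only thing taken from the dump format is
that the relevant lines contain the text 'Inlined into'.

```python
def inlined_lines(path):
    """Collect the stripped lines of a dump file that contain 'Inlined into'."""
    with open(path, errors="replace") as dump:
        return {line.strip() for line in dump if "Inlined into" in line}


def compare_dumps(old_path, new_path):
    """Return (only_in_old, only_in_new) as two sorted lists of lines."""
    old = inlined_lines(old_path)
    new = inlined_lines(new_path)
    return sorted(old - new), sorted(new - old)


def report(old_path, new_path):
    """Print the 'Inlined into' lines that appear in only one of the dumps."""
    only_old, only_new = compare_dumps(old_path, new_path)
    for line in only_old:
        print("-", line)
    for line in only_new:
        print("+", line)
```

Call report() with the aermod.f90.wpa.047i.inline files produced by the two
revisions (kept in separate directories).  If the matched lines also carry
time/size estimates, they will show up as differences even when the inlining
decision itself is unchanged, so the output still needs to be read by hand.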

Note that if -flto makes things worse compared to just -fwhole-program
(which it slightly does for me) then this is probably due to partitioning.
So you may also want to check -flto -flto-partition=none (slightly easier
to debug in the end - but without LTO it would be easiest).
