[Bug lto/57290] [4.9 Regression] After r198333 the aermod runtime is ~10% slower when compiled with -fprotect-parens and -flto

dominiq at lps dot ens.fr Wed, 15 May 2013 10:00:44 -0700

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57290


--- Comment #2 from Dominique d'Humieres <dominiq at lps dot ens.fr> ---
> There is a lot of noise in these numbers(?)

Well, AFAICT aermod.f90 has a "non-monotonic" behavior for the different
optimizations: when playing with --param max-inline-insns-auto=xx, the
execution time was not decreasing for increasing xx, but went up or down
depending on which routine was inlined.
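
A minimal sketch of how such a sweep over --param max-inline-insns-auto
could be scripted. It only builds and prints the compile commands. The
flag subset, the chosen limits and the output names aermod_inline_N are
assumptions for illustration, not the setup used for the timings in this PR.

```python
import shlex

# Assumed subset of the options quoted in this PR; adjust as needed.
BASE_FLAGS = ["-Ofast", "-funroll-loops"]


def build_cmd(limit, source="aermod.f90"):
    """Return the gfortran command line for one inlining limit.

    The output name aermod_inline_<limit> is a hypothetical convention.
    """
    exe = f"aermod_inline_{limit}"
    return [
        "gfortran",
        *BASE_FLAGS,
        "--param",
        f"max-inline-insns-auto={limit}",
        source,
        "-o",
        exe,
    ]


if __name__ == "__main__":
    # Print one compile command per limit; each resulting executable
    # would then be timed separately to see where the run time moves.
    for limit in (20, 40, 80, 160):
        print(shlex.join(build_cmd(limit)))
```

Timing each resulting executable separately would show whether the run
time goes up or down from one limit to the next.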

> the patch, apart from
>
> +       * passes.c (init_optimization_passes): Schedule a copy-propagation
> +       pass before complete unrolling of inner loops.
>
> should have had no effect on performance (well, in theory, that is).
> Can you check whether reverting the above part changes the results?

Nope

> Also, what's the variance of the numbers? 

Below 0.1s. 
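
A minimal sketch of one way to estimate run-to-run spread of this kind:
run the same command several times and report the minimum, mean and
sample standard deviation of the wall-clock times. The helper names
time_runs and summarize are hypothetical, and this is not the procedure
used for the figure above.

```python
import statistics
import subprocess
import sys
import time


def time_runs(cmd, runs=5):
    """Run cmd `runs` times and return the wall-clock durations in seconds."""
    times = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard the program's output so terminal I/O does not skew timing.
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        times.append(time.perf_counter() - start)
    return times


def summarize(times):
    """Return min, mean and sample standard deviation of the timings."""
    spread = statistics.stdev(times) if len(times) > 1 else 0.0
    return {"min": min(times), "mean": statistics.mean(times), "stdev": spread}


if __name__ == "__main__":
    # Usage: python timing.py ./a.out   (falls back to a no-op command)
    cmd = sys.argv[1:] or [sys.executable, "-c", "pass"]
    stats = summarize(time_runs(cmd))
    print("min {min:.3f}s  mean {mean:.3f}s  stdev {stdev:.3f}s".format(**stats))
```

A difference between two builds is only meaningful if it is clearly
larger than the reported spread.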

> Are (1) to (4) effectively
> the same performance r198332 vs. r198333?  

Yes for (2) and (4). For (1) and (3), I think the performance differs
slightly. What triggered this PR is (5) (can you reproduce it?) versus (3),
i.e., -fprotect-parens versus -fwhole-program -flto.

> (make sure to disable
> address-space randomization for benchmarking)

I don't really know what you are talking about (I am using Darwin).

Profiling the executable obtained with -fprotect-parens -Ofast -funroll-loops
-ftree-loop-linear -fomit-frame-pointer -fwhole-program -flto gives

- 21.8%, iblval_.lto_priv.516, a.out
- 12.7%, sigz_.lto_priv.419, a.out
- 12.7%, powf$fenv_access_off, libSystem.B.dylib
  12.4%, anyavg_.constprop.50, a.out
- 5.6%, plumef_.lto_priv.580, a.out

and with -Ofast -funroll-loops -ftree-loop-linear -fomit-frame-pointer
-fwhole-program -flto:

- 14.7%, powf$fenv_access_off, libSystem.B.dylib
- 14.5%, iblval_.lto_priv.284, a.out
- 13.8%, sigz_.lto_priv.290, a.out
  13.7%, anyavg_.constprop.50, a.out
- 4.8%, refl_ht_.lto_priv.281, a.out
- 4.7%, rmssig_.lto_priv.298, a.out
  3.1%, _gfortran_compare_string, libgfortran.3.dylib

The subroutine takes ~4.5s for the first set of options and ~2.6s for the
second one.
