https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61333

--- Comment #8 from Dominique d'Humieres <dominiq at lps dot ens.fr> ---
Some comments:

original shell: 1:1.86:2.9
+ -Ofast      : 1:1.37:1.8

(gcc 4.10.0 r210749). Does this mean that there is a problem with -Ofast and
-fopenmp?

The wall-clock times are:

original shell: 46.49s:25.83s:16.02s
+ -Ofast      : 7.82s:5.72s:4.21s
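
If the three entries per row are read as runs with 1, 2 and 4 threads (an
assumption, inferred from the n=1, 2 and 4 used in the estimate in this
comment), the speedup relative to the single-thread run can be recomputed from
these timings. A minimal Python sketch; the helper name `speedups` is made up
for illustration:

```python
def speedups(times):
    """Return the speedup of each run relative to the first (1-thread) run."""
    base = times[0]
    return [base / t for t in times]

# Wall-clock seconds quoted above, assumed to be for 1, 2 and 4 threads.
original = [46.49, 25.83, 16.02]
ofast = [7.82, 5.72, 4.21]

for label, times in (("original shell", original), ("+ -Ofast", ofast)):
    # Print in the same a:b:c layout used in this comment.
    print(f"{label:14s}: " + ":".join(f"{s:.2f}" for s in speedups(times)))
```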

Fitting Amdahl's law, t(n) = s + p/n (s serial time, p parallel time, n number
of threads), to the timings for n=1 and n=2 gives

original shell: s=5.17s, p=41.32s, time for n=4: 15.50s,
+ -Ofast      : s=3.63s, p=4.19s,  time for n=4: 4.68s.

This crude estimate suggests that the serial time is only slightly improved with
-Ofast (5.17s to 3.63s), while the parallel time is about an order of magnitude
shorter with it (41.32s to 4.19s).
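
For reference, the two-point fit can be reproduced as follows. This is only a
sketch: it solves t(1) = s + p and t(2) = s + p/2 for s and p, then evaluates
s + p/4. The function names `fit_amdahl` and `predict` are hypothetical, and
the inputs are the rounded wall-clock times quoted in this comment, so the
-Ofast results may differ from the figures above in the last digit.

```python
def fit_amdahl(t1, t2):
    """Solve t1 = s + p and t2 = s + p/2 for the serial and parallel times."""
    p = 2.0 * (t1 - t2)  # subtracting the two equations gives p/2 = t1 - t2
    s = t1 - p
    return s, p

def predict(s, p, n):
    """Amdahl-style wall-clock estimate for n threads."""
    return s + p / n

# Rounded wall-clock times (seconds) for n=1 and n=2 from this comment.
for label, t1, t2 in (("original shell", 46.49, 25.83),
                      ("+ -Ofast", 7.82, 5.72)):
    s, p = fit_amdahl(t1, t2)
    print(f"{label:14s}: s={s:.2f}s, p={p:.2f}s, "
          f"time for n=4: {predict(s, p, 4):.2f}s")
```

Comparing the n=4 estimate with the measured n=4 time gives a rough idea of how
well the simple s + p/n model describes each build.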

Could you give the wall-clock times for the different cases in comment 0?
