------- Comment #2 from dominiq at lps dot ens dot fr  2007-03-18 10:20 -------
Andrew,

Thanks for the answer. Additional timings for AMD Opteron(tm) Processor 250,
2.4Ghz:

Target: x86_64-unknown-linux-gnu
...
gcc version 4.3.0 20061231 (experimental)

[tocata] test/fortran> gfc -O3 sincos.f90 
[tocata] test/fortran> time a.out 
 -6.324121691031215E-002 -2.934957388823078E-003
19.847u 0.001s 0:20.41 97.2%    0+0k 0+0io 0pf+0w

[tocata] test/fortran> gfc -O3 sincos_o.f90
[tocata] test/fortran> time a.out
 -6.324121619598655E-002 -2.934957388823078E-003
19.793u 0.000s 0:19.80 99.9%    0+0k 0+0io 0pf+0w

[tocata] test/fortran> gfc -O3 cexp.f90 
[tocata] test/fortran> time a.out
 -6.324121619598655E-002 -2.934957388823078E-003
15.613u 0.000s 0:15.63 99.8%    0+0k 0+0io 0pf+0w

sin+cos is not optimized as cexpi.

Target: i386-pc-linux-gnu
...
gcc version 4.3.0 20070225 (experimental)

[tocata] test/fortran> gfc32 -Wa,-32 -O3 -fdump-tree-optimized sincos.f90
[tocata] test/fortran> time a.out
 -6.324122144403047E-002 -2.934963088285132E-003
10.757u 0.000s 0:10.76 99.9%    0+0k 0+0io 0pf+0w

[tocata] test/fortran> gfc32 -Wa,-32 -O3 -fdump-tree-optimized sincos_o.f90
tocata] test/fortran> time a.out 
 -6.324122124732012E-002 -2.934963117388848E-003
7.291u 0.001s 0:07.47 97.5%     0+0k 0+0io 4pf+0w

[tocata] test/fortran> gfc32 -Wa,-32 -O3 -fdump-tree-optimized cexp.f90
[tocata] test/fortran> time a.out
 -6.324122124732012E-002 -2.934963117388848E-003
11.412u 0.000s 0:11.41 100.0%   0+0k 0+0io 0pf+0w

sin+cos is optimized as cexpi which is faster than cexp -> real optimization!
The i386 code is almost twice as fast as the x86_64 one.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31249

Reply via email to