With gfortran and g++, the computation of cos(x) and sin(x) is "optimized" by
taking the real and imaginary parts of cexpi(x) (at least that is what I
understand). This pays off only if one computation of cexpi(x) is faster than
the separate computations of cos(x) and sin(x) combined.
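
To make the transformation concrete, here is a minimal sketch of the identity
I believe it relies on, exp(i*x) = cos(x) + i*sin(x). As far as I understand,
cexpi itself is an internal GCC builtin and cannot be called from user code,
so the sketch uses the complex exp intrinsic as a stand-in; the program name
and the value of x are arbitrary choices for illustration.

```fortran
program sincos_identity
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp) :: x, c, s
  complex(dp) :: z

  x = 0.75_dp                      ! arbitrary test argument

  ! Two separate library calls, as written in the source program.
  c = cos(x)
  s = sin(x)

  ! One complex exponential: exp(i*x) = cos(x) + i*sin(x).
  ! This stands in for cexpi(x), which is assumed to be GCC-internal.
  z = exp(cmplx(0.0_dp, x, dp))

  ! Both differences should be zero or at the level of rounding error.
  print *, 'cos(x) - real(exp(i*x))  =', c - real(z, dp)
  print *, 'sin(x) - aimag(exp(i*x)) =', s - aimag(z)
end program sincos_identity
```

The sketch only checks that the two forms agree numerically; it says nothing
about which one is faster on a given platform, which is the point of the
timings in this report.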

Now consider the following code:

integer, parameter :: n=1000000
integer :: i
real(8) :: pi, ss, sc, t, dt
pi = acos(-1.0d0)
dt=pi/n
sc=0
ss=0
t=0
do i= 1, 100*n
  sc = sc + cos(t-dt)
  ss = ss + sin(t)
  t = t+dt
end do
print *, sc, ss
end

The result is (G5 1.8 GHz, OS X 10.3.9):

[karma] bug/timing% gfc -O3 sincos.f90 
[karma] bug/timing% time a.out 
 -6.324121638644320E-002 -2.934958087315009E-003
13.020u 0.050s 0:13.59 96.1%    0+0k 0+2io 0pf+0w

It is easy to see that I have fooled the optimizer (cos and sin no longer take
the same argument, so the two calls cannot be merged) with the line

  sc = sc + cos(t-dt)

If I replace it by:

  sc = sc + cos(t)

the result is now (about a 67% increase in CPU time, from 13.02 s to 21.74 s):

[karma] bug/timing% gfc -O3 sincos_o.f90
[karma] bug/timing% time a.out
 -6.324121573032526E-002 -2.934958087315009E-003
21.740u 0.080s 0:22.18 98.3%    0+0k 0+2io 0pf+0w

Compare this with the following code, which evaluates the complex exponential
directly:

integer, parameter :: n=1000000
integer :: i
real(8) :: pi, ss, sc, t, dt
complex(8) :: z, dz
pi = acos(-1.0d0)
dt=pi/n
dz=cmplx(0.0d0,dt,8)
sc=0
ss=0
z=0
do i= 1, 100*n
  sc = sc + real(exp(z))
  ss = ss + aimag(exp(z))
  z = z+dz
end do
print *, sc, ss
end

Its result is

[karma] bug/timing% gfc -O3 cexp.f90
[karma] bug/timing% time a.out
 -6.324121573032526E-002 -2.934958087315009E-003
20.850u 0.110s 0:21.45 97.7%    0+0k 0+2io 0pf+0w

From the comments in PRs 30969, 30980, and 31161, I understand that on OS X
cexpi "falls back" to cexp, which agrees with the above timings.

So it would probably be nice to disable the sincos "optimisation" on platforms
that do not support a fast cexpi, such as OS X (as presently configured).

Note that on Sat, 30 Sep 2006 in

http://gcc.gnu.org/ml/fortran/2006-09/msg00454.html

I reported (in vain) a timing regression for the fatigue.f90 polyhedron
test case. Is this related to this pseudo-optimization or to another change?


-- 
           Summary: pseudo-optimization with sincos/cexpi
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: dominiq at lps dot ens dot fr
GCC target triplet: powerpc-apple-darwin7


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31249
