Hello!
You should use -ffast-math together with -D__NO_MATH_INLINES in your compile flags, or use a newer glibc. -D__NO_MATH_INLINES should also be used with -mfpmath=sse, to keep the mathinline.h header from generating x87 instructions when SSE code is used for FP math operators. Otherwise the xmm reg->mem->x87 reg moves will kill your performance.

If I compile with
$ ~/usr/bin/gcc-4.0.0 -S Com_Code.cc -ffast-math -O2
the relevant generated code section is
    #APP
        fldln2; fxch; fyl2x
    #NO_APP
        fmulp   %st, %st(2)
        fxch    %st(1)
    #APP
        fsqrt
    #NO_APP
        fld     %st(1)
    #APP
        fsin
    #NO_APP
        fxch    %st(2)
    #APP
        fcos
    #NO_APP
So after generating R, a separate fsin and fcos seem to be generated. Am I missing an option or something?
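For reference, source of roughly the following shape would produce the log / sqrt / sin / cos sequence seen above. This is only a sketch: Com_Code.cc itself is not shown in this thread, so the names GaussPair and polar_pair, and the assumption that the routine is a Box-Muller-style transform, are hypothetical. The point is that sin and cos are taken of the same argument, which is the pattern a single fsincos could serve.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the routine in Com_Code.cc (original not shown).
struct GaussPair {
    double z0;
    double z1;
};

// Box-Muller-style transform (assumed): u1 in (0, 1], u2 in [0, 1).
inline GaussPair polar_pair(double u1, double u2)
{
    assert(u1 > 0.0 && u1 <= 1.0);

    const double two_pi = 6.283185307179586;

    // The "R" term: a log followed by a sqrt (fyl2x and fsqrt on x87).
    const double r = std::sqrt(-2.0 * std::log(u1));

    // Same argument for both calls: candidates for one fsincos.
    const double theta = two_pi * u2;

    GaussPair p;
    p.z0 = r * std::cos(theta);
    p.z1 = r * std::sin(theta);
    return p;
}
```

Whether the two calls end up as separate fsin and fcos or as one combined instruction depends on the compiler version and the flags used, so the generated assembly is worth checking with -S after each change.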
Uros.