On Wed, 14 Mar 2012, Joseph S. Myers wrote:
> I'd say that "better performance with the potential loss of accuracy" should be covered by -ffast-math - that GCC should generate direct use of fsin/fcos instructions for sin/cos for -O2 -funsafe-math-optimizations on x86_64, as it does on x86, unless there is some reason to think they would perform worse than the out-of-line implementation.
The last time I ran timings (maybe 4 years ago), for double, fsin was slower than the libm software implementation compiled for x87, which was itself slower than the same implementation compiled for SSE. The software implementation was also more precise than fsin. My conclusion was to ignore fsin from then on.
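
For anyone who wants to redo such a comparison today, here is a minimal sketch in C. It is not the benchmark behind the numbers above: the names x87_fsin, seconds_per_call and HAVE_X87_FSIN are hypothetical, it assumes GCC-style inline asm on x86 or x86_64, and it uses clock() for coarse timing only.

```c
#include <assert.h>
#include <math.h>   /* NAN; callers can pass sin() from here (link with -lm) */
#include <time.h>

/* Assumption: GCC-compatible compiler targeting x86 or x86_64. */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#define HAVE_X87_FSIN 1
#else
#define HAVE_X87_FSIN 0
#endif

/* Hypothetical helper: evaluate sin(x) with the x87 fsin instruction.
 * fsin leaves its operand unchanged when |x| >= 2^63; that case is
 * not handled here. Returns NaN on targets without x87. */
static double x87_fsin(double x)
{
#if HAVE_X87_FSIN
    double r;
    /* "=t" is the top of the x87 stack; "0" places x in that same register. */
    __asm__ ("fsin" : "=t" (r) : "0" (x));
    return r;
#else
    (void)x;
    return NAN;
#endif
}

/* Hypothetical harness: average CPU seconds per call of f over n calls.
 * The volatile accumulator keeps the compiler from discarding the calls. */
static double seconds_per_call(double (*f)(double), long n)
{
    assert(f != NULL && n > 0);
    volatile double sink = 0.0;
    clock_t start = clock();
    for (long i = 0; i < n; i++)
        sink += f(0.001 * (double)i);
    clock_t stop = clock();
    (void)sink;
    return (double)(stop - start) / CLOCKS_PER_SEC / (double)n;
}
```

Calling seconds_per_call(sin, n) and seconds_per_call(x87_fsin, n) with the same n gives a rough per-call comparison. Building once with -mfpmath=387 and once with -mfpmath=sse changes only how the harness itself is compiled; comparing libm builds for x87 and SSE, as described above, would need libm itself built both ways.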
-- Marc Glisse