------- Additional Comments From bangerth at dealii dot org  2004-12-01 20:49 -------
In reply to comment #6:
> Please note, that we should return the result in fp reg, so final flds is
> needed in any case. I think, this code is optimal.
Almost, or at least I believe so. If we assume that operations on xmm
registers cost the same as operations on the x87 floating point stack,
then the code generated with -mfpmath=387,sse needs one more stack push
and pop than the code generated with -mfpmath=387. The compiler should
recognize this and simply not use the SSE registers at all.
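
To make the comparison concrete, here is a minimal sketch of the kind of
function where the difference would show up. It is NOT the test case
from this PR; the function name and body are hypothetical. It assumes a
32-bit x86 target, where a float return value is passed back in st(0),
so a result computed in an xmm register has to be stored to memory and
reloaded with flds before returning.

```c
/* Hypothetical example, not the original test case from this PR.
 *
 * To compare the generated code, compile to assembly twice (32-bit x86
 * target assumed; the exact output depends on the compiler version):
 *
 *   gcc -m32 -O2 -mfpmath=387 -S example.c
 *   gcc -m32 -O2 -msse -mfpmath=387,sse -S example.c
 *
 * With pure x87 math the result is already in st(0) at the return.
 * If the mixed mode does part of the arithmetic in xmm registers, the
 * value must be spilled to the stack and reloaded with flds first.
 */
float scaled_sum(float a, float b, float c)
{
    return (a + b) * c;
}
```

If the second listing contains a store from an xmm register followed by
flds of the same stack slot, that pair is the extra push and pop
discussed above.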
I will open a new PR for this, and another one for the vectorization issue.
Thanks for now
Wolfgang
--
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17619