------- Additional Comments From bangerth at dealii dot org 2004-12-01 20:49 -------
In reply to comment #6:
> Please note, that we should return the result in fp reg, so final flds is
> needed in any case. I think, this code is optimal.

Almost, or at least I believe so. If we assume that all the operations with
xmm registers cost the same as with the floating point stack, then the result
of -mfpmath=387,sse requires one stack push and pop more than the result of
-mfpmath=387. The compiler should recognize this and then simply not use the
sse registers at all.

I will open a new PR for this, and another one for the vectorization issue.

Thanks for now
  Wolfgang

--
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17619