http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56253
--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> 2013-02-08 13:42:23 UTC ---
(In reply to comment #3)
> (In reply to comment #1)
>
> > not sure why we use builtins for these basic operations...
>
> Because they have to be emitted also for non-SSE math.
>
> From config/i386/sse.md:
>
> ;; The standard names for fma is only available with SSE math enabled.
> (define_expand "fma<mode>4"
>   [(set (match_operand:FMAMODE 0 "register_operand")
>         (fma:FMAMODE
>           (match_operand:FMAMODE 1 "nonimmediate_operand")
>           (match_operand:FMAMODE 2 "nonimmediate_operand")
>           (match_operand:FMAMODE 3 "nonimmediate_operand")))]
>   "(TARGET_FMA || TARGET_FMA4) && TARGET_SSE_MATH")
>
> ...
>
> ;; The builtin for intrinsics is not constrained by SSE math enabled.
>
> (define_expand "fma4i_fmadd_<mode>"
>   [(set (match_operand:FMAMODE 0 "register_operand")
>         (fma:FMAMODE
>           (match_operand:FMAMODE 1 "nonimmediate_operand")
>           (match_operand:FMAMODE 2 "nonimmediate_operand")
>           (match_operand:FMAMODE 3 "nonimmediate_operand")))]
>   "TARGET_FMA || TARGET_FMA4")

Hmm, I wonder how the vectorizer then accesses add/sub patterns without
SSE math.  It just queries optabs ...  We cannot handle the FMA case
with standard operations anyway.  But if SSE modes are used, why should
convert_mult_to_fma have to back off (it also just looks at standard
optabs)?

That said - should the above TARGET_SSE_MATH restriction not only apply
to scalar modes?
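
To make the question concrete, here is a toy model in plain C of the two
quoted expander conditions, plus the relaxation being asked about.  It is
only a sketch and not GCC code: struct target_flags, the three predicate
functions and the vector_mode parameter are all hypothetical names made up
for illustration, and the "relaxed" condition is just one possible reading
of "apply the restriction only to scalar modes".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the target macros named in the quoted
   sse.md conditions.  This models the boolean logic only.  */
struct target_flags {
    bool fma;       /* models TARGET_FMA */
    bool fma4;      /* models TARGET_FMA4 */
    bool sse_math;  /* models TARGET_SSE_MATH */
};

/* Condition on the standard-name expander "fma<mode>4", as quoted.  */
static bool std_fma_enabled(struct target_flags t)
{
    return (t.fma || t.fma4) && t.sse_math;
}

/* Condition on the intrinsic expander "fma4i_fmadd_<mode>", as quoted.  */
static bool intrinsic_fma_enabled(struct target_flags t)
{
    return t.fma || t.fma4;
}

/* Assumed relaxation: require SSE math only when the mode is scalar.
   vector_mode is a hypothetical flag, not something sse.md provides.  */
static bool std_fma_enabled_relaxed(struct target_flags t, bool vector_mode)
{
    return (t.fma || t.fma4) && (vector_mode || t.sse_math);
}

/* Walk through the case under discussion: FMA hardware, no SSE math.  */
static void check_model(void)
{
    struct target_flags t = { .fma = true, .fma4 = false, .sse_math = false };

    assert(!std_fma_enabled(t));                /* optab query fails */
    assert(intrinsic_fma_enabled(t));           /* builtin still expands */
    assert(std_fma_enabled_relaxed(t, true));   /* vector modes would pass */
    assert(!std_fma_enabled_relaxed(t, false)); /* scalar modes still blocked */
}
```

Under this model, a pass that only looks at the standard optab sees no fma
for any mode when SSE math is off, even though the intrinsic pattern is
usable; with the relaxed condition only the scalar modes would be excluded.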