http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56253
Bug #: 56253
Summary: fp-contract does not work with SSE and AVX FMAs (neither FMA4 nor FMA3)
Classification: Unclassified
Product: gcc
Version: 4.7.2
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: kr...@kde.org

Take the following testcase:

    #include <immintrin.h>

    __m256 foo(__m256 a, __m256 b, __m256 c) {
        return _mm256_add_ps(_mm256_mul_ps(a, b), c);
    }

    __m128 foo(__m128 a, __m128 b, __m128 c) {
        return _mm_add_ps(_mm_mul_ps(a, b), c);
    }

    float foo(float a, float b, float c) {
        return a * b + c;
    }

When compiled with 'g++ -O3 -mfma -ffp-contract=fast -fabi-version=0 -c', only the third function uses FMA instructions (the same happens with -mfma4). The SSE and AVX variants should get the same contraction that is already implemented for scalar operations.
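For readers unfamiliar with why contraction is gated behind -ffp-contract at all: a fused multiply-add rounds once, whereas a separate multiply and add round twice, so the two forms can return different values. The following is only an illustrative sketch in portable scalar C++ and is not part of the testcase. The helper names (mul_then_add, fused_mul_add, example_inputs) are made up for this example, and the volatile store is an assumption used to keep the compiler from contracting the unfused version by itself.

```cpp
#include <cmath>

// Multiply, round to float, then add: two roundings.
// The volatile store is only there to stop the compiler from
// contracting the expression into an FMA on its own.
float mul_then_add(float a, float b, float c) {
    volatile float product = a * b;
    return product + c;
}

// Fused multiply-add: a * b + c with a single rounding at the end.
float fused_mul_add(float a, float b, float c) {
    return std::fma(a, b, c);
}

// Hypothetical inputs: the exact product (1 + 2^-12)^2 equals
// 1 + 2^-11 + 2^-24, which is not representable as a float and
// rounds to 1 + 2^-11 under round-to-nearest-even.
struct Inputs { float a, b, c; };

Inputs example_inputs() {
    const float a = 1.0f + std::ldexp(1.0f, -12);     // 1 + 2^-12
    const float c = -(1.0f + std::ldexp(1.0f, -11));  // -(1 + 2^-11)
    return {a, a, c};
}
```

With these inputs and the default round-to-nearest mode, mul_then_add returns 0 while fused_mul_add returns 2^-24. That difference is what the user accepts by passing -ffp-contract=fast, and the request in this report is that the same permission be honoured for the vector intrinsics.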