https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93342
            Bug ID: 93342
           Summary: wrong AVX mask generation with -funsafe-math-optimizations
           Product: gcc
           Version: 9.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: nathanael.schaeffer at gmail dot com
  Target Milestone: ---

When trying to produce an XOR mask to negate the even elements of an AVX
vector, gcc produces wrong code with -funsafe-math-optimizations.
I've tried several ways, all giving the same wrong answer: a mask negating ALL
elements instead of just the even ones.
Since the mask is generated using INTEGER arithmetic, I don't understand the
issue here.

The only way to get a correct result with AVX alone is to define a variable
with the mask already set. With AVX2, one can use integer intrinsics, which
produce the correct mask.

The code showing the bug can be seen here:
https://godbolt.org/z/q9eamc

For the record, I also copy the code below.

When compiling the following with
  -O -mavx2 -funsafe-math-optimizations -S
the mask is wrong. Without -funsafe-math-optimizations it is correct.
This is unexpected, because the mask is built with integer arithmetic and
-funsafe-math-optimizations only affects floating point (according to the man
page).
Even stranger, the same mask, when xor-ed using integer AVX2 intrinsics, gives
the correct results.

#include <immintrin.h>

typedef __m128d v2d;
typedef __m256d v4d;

// generates: vxorpd ymm0, ymm0, YMMWORD PTR wrong_mask
v4d negate_even_fail(v4d v) {
    __m256i mask = _mm256_setr_epi32(0,-2147483648, 0,0, 0,-2147483648, 0,0);
    return _mm256_xor_pd(v, _mm256_castsi256_pd(mask));
}

// generates: vxorpd ymm0, ymm0, YMMWORD PTR correct_mask
v4d negate_even_does_not_fail(v4d v) {
    __m256i mask = _mm256_setr_epi32(0,-2147483648, 0,0, 0,-2147483648, 0,0);
    return _mm256_castsi256_pd(_mm256_xor_si256(_mm256_castpd_si256(v), mask));
}
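
For reference, the result the report expects can be written as a scalar sketch that needs no AVX support. This is only an illustration, not part of the original report: the helper names xor_sign_bits and negate_even_reference are hypothetical, and the 64-bit lane values assume the little-endian lane layout of x86, where each pair of 32-bit arguments to _mm256_setr_epi32 forms one 64-bit lane with the second argument in the high half.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the report): XOR a 64-bit mask into the
   bit pattern of a double, using integer operations only. This mirrors
   what vxorpd does to a single lane. */
static double xor_sign_bits(double x, uint64_t mask)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);   /* reinterpret the double as an integer */
    bits ^= mask;
    memcpy(&x, &bits, sizeof x);      /* reinterpret the integer as a double */
    return x;
}

/* Hypothetical scalar reference for the intended behavior.
   The report's mask is
     _mm256_setr_epi32(0, -2147483648, 0, 0, 0, -2147483648, 0, 0)
   which, viewed as four 64-bit lanes (assuming x86 little-endian layout), is
     lane 0: 0x8000000000000000   (sign bit set, element negated)
     lane 1: 0                    (element unchanged)
     lane 2: 0x8000000000000000   (sign bit set, element negated)
     lane 3: 0                    (element unchanged) */
static void negate_even_reference(const double in[4], double out[4])
{
    static const uint64_t mask[4] = {
        0x8000000000000000ULL, 0, 0x8000000000000000ULL, 0
    };
    for (int i = 0; i < 4; i++)
        out[i] = xor_sign_bits(in[i], mask[i]);
}
```

Comparing the output of negate_even_fail against such a reference, element by element, shows whether the compiler emitted the intended mask (only elements 0 and 2 change sign) or the wrong one described above (all four elements change sign).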