https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110202
Bug ID: 110202
Summary: _mm512_ternarylogic_epi64 generates unnecessary operations
Product: gcc
Version: 13.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: fabio at cannizzo dot net
Target Milestone: ---

Consider the following two alternative implementations of a bitwise
complement of an AVX-512 register.

#include <immintrin.h>

__m512i negate1(const __m512i *a)
{
    __m512i res;
    res = _mm512_ternarylogic_epi64(res, res, *a, 0x55);
    return res;
}

__m512i negate2(const __m512i *a)
{
    __m512i res;
    res = _mm512_xor_si512(*a, _mm512_set1_epi32(-1));
    return res;
}

Compiled with "-O3 -mavx512f", these generate the following asm listings
(see godbolt: https://godbolt.org/z/jvrxEjW65):

negate1(long long __vector(8) const*):
        vpxor   xmm0, xmm0, xmm0
        vpternlogq      zmm0, zmm0, ZMMWORD PTR [rdi], 85
        ret
negate2(long long __vector(8) const*):
        vpternlogd      zmm0, zmm0, ZMMWORD PTR [rdi], 0x55
        ret

negate1 introduces an unnecessary vpxor. This is probably because the
compiler does not recognize that vpternlog with immediate 0x55 reads only
its third operand, so the values of the first two operands are irrelevant.
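
To illustrate why immediate 0x55 depends only on the third operand, here is a minimal scalar sketch of the vpternlog bit rule. It is not GCC code and it does not use the intrinsic; the names ternlog64 and depends_only_on_c are hypothetical helpers introduced for this example only. It assumes the usual convention that, for each bit position, the three input bits form the index (a<<2 | b<<1 | c) into the 8-bit immediate.

```cpp
#include <cstdint>

// Scalar model of the ternary-logic rule on one 64-bit lane (hypothetical
// helper, not a GCC or Intel API). For each bit position, the bits of
// a, b and c select one bit of imm8 using the index (a<<2 | b<<1 | c).
constexpr std::uint64_t ternlog64(std::uint64_t a, std::uint64_t b,
                                  std::uint64_t c, std::uint8_t imm8) {
    std::uint64_t r = 0;
    for (int bit = 0; bit < 64; ++bit) {
        const unsigned idx = static_cast<unsigned>(
            (((a >> bit) & 1u) << 2) | (((b >> bit) & 1u) << 1) |
            ((c >> bit) & 1u));
        r |= static_cast<std::uint64_t>((imm8 >> idx) & 1u) << bit;
    }
    return r;
}

// True if the truth table imm8 yields the same result for every (a, b)
// pair once c is fixed, i.e. the result is a function of c alone.
constexpr bool depends_only_on_c(std::uint8_t imm8) {
    for (unsigned c = 0; c < 2; ++c) {
        const unsigned ref = (imm8 >> c) & 1u;  // entry for a = 0, b = 0
        for (unsigned ab = 1; ab < 4; ++ab) {
            if (((imm8 >> ((ab << 1) | c)) & 1u) != ref) return false;
        }
    }
    return true;
}

// 0x55 = 0b01010101: the result bit is 1 exactly when the c bit is 0 (NOT c).
static_assert(depends_only_on_c(0x55), "0x55 ignores the first two operands");
static_assert(ternlog64(0x1234, 0xFFFF, 0x0F0F, 0x55) ==
                  ~std::uint64_t{0x0F0F},
              "0x55 computes the complement of the third operand");
```

Under this model, any values passed as the first two operands give the same result for 0x55, which is why zeroing zmm0 before the vpternlogq in negate1 looks avoidable.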