https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95764

            Bug ID: 95764
           Summary: Failure to optimize usage of _mm512_set1_epi32 to a
                    single instruction
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gabravier at gmail dot com
  Target Milestone: ---

#include <immintrin.h>

__m512i f(__m512i a)
{
    return (_mm512_set1_epi32(0x7FFFFFFF) & a);
}

With -O3 -mavx512f, LLVM outputs this:

.LCPI0_0:
  .quad 9223372034707292159
f(long long __vector(8)):
  vpandq zmm0, zmm0, qword ptr [rip + .LCPI0_0]{1to8}
  ret

GCC outputs this:

f(long long __vector(8)):
  mov eax, 2147483647
  vpbroadcastd zmm1, eax
  vpandq zmm0, zmm0, zmm1
  ret
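
For reference, the .quad constant in the LLVM output is 0x7FFFFFFF repeated in
both 32-bit halves of a 64-bit word, which is why a 64-bit embedded broadcast
({1to8}) can stand in for the 32-bit _mm512_set1_epi32 here. A minimal sketch
that checks this arithmetic in plain C, with no AVX-512 needed; the helper names
splat32_to_u64 and check_llvm_constant are made up for illustration and do not
come from either compiler:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: repeat a 32-bit value in both halves of a
   64-bit word, as a 64-bit broadcast of the constant would need. */
uint64_t splat32_to_u64(uint32_t x)
{
    return ((uint64_t)x << 32) | (uint64_t)x;
}

/* Check that the decimal .quad literal from the LLVM output is exactly
   0x7FFFFFFF duplicated into both 32-bit lanes of a quadword. */
void check_llvm_constant(void)
{
    assert(splat32_to_u64(0x7FFFFFFFu) == 0x7FFFFFFF7FFFFFFFULL);
    assert(splat32_to_u64(0x7FFFFFFFu) == 9223372034707292159ULL);
}
```

Calling check_llvm_constant() from a main() should return without tripping
either assertion.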

I'm not completely sure the LLVM version is better, but I'd rather file a bug
report (and be able to file one back to LLVM if I learn that GCC's code is
better) than just do nothing.
