https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108856

            Bug ID: 108856
           Summary: Increment and decrement on
                    std::experimental::where_expression should optimize
                    better
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: libstdc++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: mkretz at gcc dot gnu.org
  Target Milestone: ---
            Target: x86_64-*-*, i?86-*-*

#include <experimental/simd>

namespace stdx = std::experimental;

auto f(stdx::native_simd<int> a, stdx::native_simd_mask<int> k)
{
  ++where(k, a);
  return a;
}

With AVX512 this should compile to a bitmask-to-vector-mask conversion followed
by a subtraction (each selected lane of the vector mask is -1, so subtracting
it adds 1):
        kmovw   k0, edi
        vpmovm2d        zmm1, k0
        vpsubd  zmm0, zmm0, zmm1

Instead we get:
  vmovdqa32 zmm1, zmm0
  mov eax, 1
  kmovw k1, edi
  vpbroadcastd zmm0, eax
  vmovdqa32 zmm2, zmm1
  vpaddd zmm2{k1}, zmm1, zmm0
  vmovdqa32 zmm0, zmm2

Without AVX512 the mask is already a vector mask (in ymm1), so this should
compile to a single subtraction:
        vpsubd  ymm0, ymm0, ymm1

Instead we get:
  mov eax, 1
  vmovd xmm2, eax
  vpbroadcastd ymm2, xmm2
  vpaddd ymm2, ymm0, ymm2
  vpblendvb ymm0, ymm0, ymm2, ymm1
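
The expected sequences above rely on a simple identity: a vector mask holds -1
in selected lanes and 0 elsewhere, so a masked increment is `a - mask` and a
masked decrement is `a + mask`. The following is a minimal scalar sketch of
that identity, not the libstdc++ implementation. The names `Lanes`,
`expand_bitmask`, and the `*_reference` helpers are hypothetical, and the lane
count of 16 is an assumption matching one zmm register of 32-bit ints.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Assumption: 16 lanes of int32, as in one zmm register.
constexpr std::size_t kLanes = 16;
using Lanes = std::array<std::int32_t, kLanes>;

// Wrapping arithmetic via unsigned types, to avoid signed-overflow UB.
inline std::int32_t wrap_add(std::int32_t x, std::int32_t y) {
  return static_cast<std::int32_t>(static_cast<std::uint32_t>(x) +
                                   static_cast<std::uint32_t>(y));
}
inline std::int32_t wrap_sub(std::int32_t x, std::int32_t y) {
  return static_cast<std::int32_t>(static_cast<std::uint32_t>(x) -
                                   static_cast<std::uint32_t>(y));
}

// Bitmask (one bit per lane) to vector mask: -1 where the bit is set, else 0.
inline Lanes expand_bitmask(std::uint16_t bits) {
  Lanes m{};
  for (std::size_t i = 0; i < kLanes; ++i)
    m[i] = ((bits >> i) & 1u) ? -1 : 0;
  return m;
}

// Reference semantics of ++where(k, a): add 1 only in selected lanes.
inline Lanes masked_increment_reference(Lanes a, std::uint16_t bits) {
  for (std::size_t i = 0; i < kLanes; ++i)
    if ((bits >> i) & 1u) a[i] = wrap_add(a[i], 1);
  return a;
}

// Reference semantics of --where(k, a): subtract 1 only in selected lanes.
inline Lanes masked_decrement_reference(Lanes a, std::uint16_t bits) {
  for (std::size_t i = 0; i < kLanes; ++i)
    if ((bits >> i) & 1u) a[i] = wrap_sub(a[i], 1);
  return a;
}

// Branch-free form: subtracting the vector mask adds 1 in selected lanes.
inline Lanes increment_by_subtraction(Lanes a, std::uint16_t bits) {
  const Lanes m = expand_bitmask(bits);
  for (std::size_t i = 0; i < kLanes; ++i) a[i] = wrap_sub(a[i], m[i]);
  return a;
}

// Branch-free form: adding the vector mask subtracts 1 in selected lanes.
inline Lanes decrement_by_addition(Lanes a, std::uint16_t bits) {
  const Lanes m = expand_bitmask(bits);
  for (std::size_t i = 0; i < kLanes; ++i) a[i] = wrap_add(a[i], m[i]);
  return a;
}
```

For any input, `increment_by_subtraction` should agree with
`masked_increment_reference`, and likewise for the decrement pair, which is why
no broadcast of the constant 1 and no blend should be needed.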
