https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108856
Bug ID: 108856
Summary: Increment and decrement on std::experimental::where_expression should optimize better
Product: gcc
Version: 13.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: libstdc++
Assignee: unassigned at gcc dot gnu.org
Reporter: mkretz at gcc dot gnu.org
Target Milestone: ---
Target: x86_64-*-*, i?86-*-*

    #include <experimental/simd>

    namespace stdx = std::experimental;

    auto f(stdx::native_simd<int> a, stdx::native_simd_mask<int> k)
    {
      ++where(k, a);
      return a;
    }

With AVX512 this should compile to a bitmask to vectormask conversion with
subsequent subtraction:

    kmovw k0, edi
    vpbroadcastmw2d zmm1, k0
    vpsubd zmm0, zmm0, zmm1

Instead we get:

    vmovdqa32 zmm1, zmm0
    mov eax, 1
    kmovw k1, edi
    vpbroadcastd zmm0, eax
    vmovdqa32 zmm2, zmm1
    vpaddd zmm2{k1}, zmm1, zmm0
    vmovdqa32 zmm0, zmm2

Without AVX512 this should compile to a single subtraction:

    vpsubd ymm0, ymm0, ymm1

Instead we get:

    mov eax, 1
    vmovd xmm2, eax
    vpbroadcastd ymm2, xmm2
    vpaddd ymm2, ymm0, ymm2
    vpblendvb ymm0, ymm0, ymm2, ymm1
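
For readers wondering why a subtraction is the expected codegen: a vector mask holds all-ones bits (i.e. -1) in selected lanes and 0 elsewhere, so subtracting the mask adds 1 exactly where the mask is set. The scalar sketch below illustrates that identity only; it is not libstdc++ code. The names `Lanes`, `kLanes`, `masked_increment`, `subtract_mask`, and `add_mask` are hypothetical, and unsigned lanes are assumed so that wraparound matches what the vector instructions do without invoking signed-overflow UB.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical lane count; 8 x 32-bit matches one 256-bit ymm register.
constexpr std::size_t kLanes = 8;
using Lanes = std::array<std::uint32_t, kLanes>;

// Reference semantics of ++where(k, a): add 1 only in selected lanes.
// A lane counts as selected when its mask element is nonzero.
inline Lanes masked_increment(Lanes a, const Lanes& mask) {
    for (std::size_t i = 0; i < kLanes; ++i) {
        if (mask[i] != 0) {
            a[i] += 1;
        }
    }
    return a;
}

// The single-subtraction form. Assumes each mask element is either
// 0xFFFFFFFF (selected) or 0 (not selected), as in a vector mask.
// a - 0xFFFFFFFF == a + 1 (mod 2^32), and a - 0 == a.
inline Lanes subtract_mask(Lanes a, const Lanes& mask) {
    for (std::size_t i = 0; i < kLanes; ++i) {
        a[i] -= mask[i];
    }
    return a;
}

// The decrement counterpart: a + 0xFFFFFFFF == a - 1 (mod 2^32).
inline Lanes add_mask(Lanes a, const Lanes& mask) {
    for (std::size_t i = 0; i < kLanes; ++i) {
        a[i] += mask[i];
    }
    return a;
}
```

Under the stated assumption about mask values, `subtract_mask(a, m)` and `masked_increment(a, m)` agree for every input, which is why neither a broadcast constant 1 nor a blend is needed.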