https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108856
Bug ID: 108856
Summary: Increment and decrement on std::experimental::where_expression should optimize better
Product: gcc
Version: 13.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: libstdc++
Assignee: unassigned at gcc dot gnu.org
Reporter: mkretz at gcc dot gnu.org
Target Milestone: ---
Target: x86_64-*-*, i?86-*-*
#include <experimental/simd>
namespace stdx = std::experimental;
auto f(stdx::native_simd<int> a, stdx::native_simd_mask<int> k)
{
  ++where(k, a);
  return a;
}
With AVX512 this should compile to a bitmask-to-vector-mask conversion followed
by a subtraction:
kmovw k0, edi
vpbroadcastmw2d zmm1, k0
vpsubd zmm0, zmm0, zmm1
Instead we get:
vmovdqa32 zmm1, zmm0
mov eax, 1
kmovw k1, edi
vpbroadcastd zmm0, eax
vmovdqa32 zmm2, zmm1
vpaddd zmm2{k1}, zmm1, zmm0
vmovdqa32 zmm0, zmm2
Without AVX512 the mask is already a vector mask, so this should compile to a
single subtraction:
vpsubd ymm0, ymm0, ymm1
Instead we get:
mov eax, 1
vmovd xmm2, eax
vpbroadcastd ymm2, xmm2
vpaddd ymm2, ymm0, ymm2
vpblendvb ymm0, ymm0, ymm2, ymm1
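
The expected single subtraction relies on a vector mask holding -1 (all bits
set) in selected lanes and 0 elsewhere, so that a - k adds 1 exactly where the
mask is true. A minimal scalar sketch of that equivalence follows; the names
Lanes, masked_increment and subtract_mask are hypothetical and only model the
lanes of one ymm register, they are not part of libstdc++:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model of an 8-lane int vector (one AVX2 ymm register).
using Lanes = std::array<std::int32_t, 8>;

// Reference semantics of ++where(k, a): add 1 only in selected lanes.
// Assumption: k is a vector mask, -1 = selected, 0 = not selected.
Lanes masked_increment(Lanes a, const Lanes& k)
{
    for (std::size_t i = 0; i < a.size(); ++i)
        if (k[i] != 0)
            a[i] += 1;
    return a;
}

// The lowering the report asks for: subtract the mask itself.
// a - (-1) == a + 1 in selected lanes, a - 0 == a in the others.
Lanes subtract_mask(Lanes a, const Lanes& k)
{
    for (std::size_t i = 0; i < a.size(); ++i)
        a[i] -= k[i];
    return a;
}
```

Both functions return the same lanes for any mask made only of -1 and 0
entries (as long as no lane overflows), which is why the add plus blend
sequence is redundant. The same argument applies to decrement with an
addition of the mask instead.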