https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121662
Bug ID: 121662
Summary: Unnecessary data-dependent branches with avx512 masks
Product: gcc
Version: 16.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: andi-gcc at firstfloor dot org
Target Milestone: ---
Target: x86_64
/* { dg-do compile } */
/* { dg-options " -mavx512vl -mavx512bw -mavx512f -O3" } */
void
ubyteshiftl_mask (unsigned char *a, int len)
{
  int i;
  for (i = 0; i < len; i++)
    if (a[i] & 1)
      a[i] <<= 5;
}
generates
	kortestq	%k1, %k1
	je	.L4
	vpsllw	$5, %zmm0, %zmm0
	vpandq	%zmm0, %zmm3, %zmm0
	vmovdqu8	%zmm0, (%rdx){%k1}
.L4:
The branch is unnecessary because the mask already does all the work
(although the mask should perhaps be applied to the computation too).
If a[i] & 1 is unpredictable, the branch slows down the loop through
mispredictions. When masks are available we should avoid data-dependent
branches, because the masked form needs no prediction.
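
To illustrate the idea at source level, here is a minimal scalar sketch of the branch-free form. The shift is computed unconditionally and a per-element mask selects the result. The function names are hypothetical and not part of the test case. Note the assumption that it is acceptable to write every byte back: unlike the masked vmovdqu8 store, this variant also stores to elements whose low bit is clear, so it is not a drop-in equivalent where that matters.

```c
/* Reference: the loop from the report, with a data-dependent branch.  */
static void
shift_if_odd_branchy (unsigned char *a, int len)
{
  for (int i = 0; i < len; i++)
    if (a[i] & 1)
      a[i] <<= 5;
}

/* Hypothetical branch-free variant: the mask alone decides which
   value survives, so nothing depends on predicting a[i] & 1.
   Assumption: storing unchanged bytes back is harmless here.  */
static void
shift_if_odd_select (unsigned char *a, int len)
{
  for (int i = 0; i < len; i++)
    {
      unsigned char v = a[i];
      /* 0xFF when the low bit is set, 0x00 otherwise.  */
      unsigned char m = (unsigned char) -(v & 1);
      unsigned char shifted = (unsigned char) (v << 5);
      /* Select shifted where the mask is set, the old value elsewhere.  */
      a[i] = (unsigned char) ((shifted & m) | (v & (unsigned char) ~m));
    }
}
```

Both functions should produce the same array contents for any input. The second one corresponds to dropping the kortestq/je pair and letting the mask gate the result.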