https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94790
Bug ID: 94790
Summary: Failure to use andn in specific pattern in which it is available
Product: gcc
Version: 10.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: gabravier at gmail dot com
Target Milestone: ---
unsigned r1(unsigned a, unsigned b, unsigned mask)
{
    return a ^ ((a ^ b) & mask);
}

unsigned r2(unsigned a, unsigned b, unsigned mask)
{
    return (~mask & a) | (b & mask);
}
`r1` and `r2` are equivalent. GCC translates `r2` into `r1`. LLVM instead
transforms `r1` into `r2` when an "and not" instruction is available. I haven't
benchmarked the resulting code extensively, but basic measurements and llvm-mca
indicate that the code using andn is faster than the code using the `r1` pattern
(and the code using andn takes one fewer instruction on x86).
Comparison of generated code with `-mbmi`: https://godbolt.org/z/2PUhBX
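
For anyone who wants to double-check the equivalence claim, here is a minimal sketch of a brute-force comparison. The helper name `count_mismatches` is hypothetical and not part of the original test case. Both functions are purely bitwise, so each result bit depends only on the corresponding bits of `a`, `b` and `mask`. Enumerating all 8-bit inputs should therefore cover every per-bit combination.

```c
/* r1 and r2 are copied from the report above. */
unsigned r1(unsigned a, unsigned b, unsigned mask)
{
    return a ^ ((a ^ b) & mask);
}

unsigned r2(unsigned a, unsigned b, unsigned mask)
{
    return (~mask & a) | (b & mask);
}

/* Hypothetical helper (not from the report): compares r1 and r2 over all
   combinations of 8-bit a, b and mask, and returns the number of inputs
   on which the two results differ. */
unsigned count_mismatches(void)
{
    unsigned mismatches = 0;

    for (unsigned a = 0; a < 256; a++)
        for (unsigned b = 0; b < 256; b++)
            for (unsigned mask = 0; mask < 256; mask++)
                if (r1(a, b, mask) != r2(a, b, mask))
                    mismatches++;

    return mismatches;
}
```

Both forms select bits from `b` where `mask` is set and from `a` elsewhere, so `count_mismatches()` is expected to return 0. This only checks semantics. It says nothing about which form is faster.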