https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94790

            Bug ID: 94790
           Summary: Failure to use andn in specific pattern in which it is
                    available
           Product: gcc
           Version: 10.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gabravier at gmail dot com
  Target Milestone: ---

unsigned r1(unsigned a, unsigned b, unsigned mask)
{
    return a ^ ((a ^ b) & mask);
}

unsigned r2(unsigned a, unsigned b, unsigned mask)
{
    return (~mask & a) | (b & mask);
}

`r1` and `r2` are equivalent. GCC translates `r2` into `r1`, whereas LLVM
transforms `r1` into `r2` when an "and not" instruction is available. I haven't
benchmarked the resulting code extensively, but basic measurements and llvm-mca
indicate that the code using andn is faster than the code using the `r1`
pattern (and the andn version takes one fewer instruction on x86).
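
To back up the equivalence claim, here is a minimal sketch that checks it by
truth table. Both forms are purely bitwise, so it should be enough to test each
operand as all-zeros and all-ones: that covers every per-bit combination of
`a`, `b` and `mask`. The helper name `count_mismatches` is made up for this
illustration and is not part of the original report.

```c
/* Same two functions as in the report. */
unsigned r1(unsigned a, unsigned b, unsigned mask)
{
    return a ^ ((a ^ b) & mask);
}

unsigned r2(unsigned a, unsigned b, unsigned mask)
{
    return (~mask & a) | (b & mask);
}

/* Hypothetical helper: counts the (a, b, mask) truth-table rows on which
 * r1 and r2 disagree. Each operand is either all-zeros or all-ones, so the
 * 8 combinations cover every possible per-bit input. */
unsigned count_mismatches(void)
{
    const unsigned bits[2] = { 0u, ~0u };
    unsigned mismatches = 0;

    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j)
            for (int k = 0; k < 2; ++k)
                if (r1(bits[i], bits[j], bits[k]) !=
                    r2(bits[i], bits[j], bits[k]))
                    ++mismatches;

    return mismatches;
}
```

Where a mask bit is set, both functions take that bit from `b`; where it is
clear, both take it from `a`, so `count_mismatches()` is expected to return 0.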

Comparison of generated code with `-mbmi`: https://godbolt.org/z/2PUhBX
