[Bug rtl-optimization/109973] New: Wrong code for AVX2 since 13.1 by combining VPAND and VPTEST

benjsith at gmail dot com via Gcc-bugs Thu, 25 May 2023 17:38:47 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109973


            Bug ID: 109973
           Summary: Wrong code for AVX2 since 13.1 by combining VPAND and
                    VPTEST
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: benjsith at gmail dot com
  Target Milestone: ---

The following code is a minimal repro of the issue, when compiled with `gcc -O1
-mavx2`:

#include <immintrin.h>

int do_stuff(__m256i Y0, __m256i Y1, __m128i X2) {
  __m256i And01 = _mm256_and_si256(Y0, Y1);
  int TestResult = _mm256_testc_si256(And01, And01);
  return TestResult;
}

I have also attached the preprocessed version of that minimal repro.

12.3 produces the following assembly:
        vpand   ymm0, ymm0, ymm1  ; <<<<< missing in 13.1
        mov     eax, 0
        vptest  ymm0, ymm0
        setb    al
        ret

While 13.1 generates:
        mov     eax, 0
        vptest  ymm0, ymm1
        setb    al
        ret

Note that as of 13.1, the VPAND is removed and only the VPTEST remains.
However, this is incorrect because _mm256_testc_si256 returns the Carry Flag,
which is set when the bitwise AND of the second operand with the bitwise NOT
of the first operand is all zeros. With And01 as both operands the result is
always 1, whereas VPTEST applied directly to Y0 and Y1 computes a different
value, so the two instructions can't be combined like that for this intrinsic.
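
To make the flag semantics concrete, here is a small scalar sketch that models
the Carry Flag on a single 64-bit lane instead of a full 256-bit register. The
function names (testc_model, expected_result, folded_result) are hypothetical
and exist only for illustration; this is not GCC code, only a model of what
the two assembly sequences compute under that simplification.

```c
#include <stdint.h>

/* Scalar model of the VPTEST carry flag on one 64-bit lane (assumption:
   a single lane stands in for the whole 256-bit register).
   CF is set when (NOT a) AND b has no bits set. */
int testc_model(uint64_t a, uint64_t b) {
    return (~a & b) == 0;
}

/* What the source asks for: AND first, then test the result against itself. */
int expected_result(uint64_t y0, uint64_t y1) {
    uint64_t and01 = y0 & y1;
    return testc_model(and01, and01); /* ~x & x is always 0, so this is 1 */
}

/* What the 13.1 output computes: VPTEST applied directly to the two inputs. */
int folded_result(uint64_t y0, uint64_t y1) {
    return testc_model(y0, y1);
}
```

With y0 = 0 and y1 = 1, expected_result gives 1 while folded_result gives 0,
which mirrors the mismatch between the 12.3 and 13.1 output above.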

Here is a Godbolt link showing the issue with a full execution case:
https://godbolt.org/z/x9or4WEWh

This issue is present in 13.1 but not in 12.3. A bisect shows that it was most
likely introduced in a56c1641e9d25e46059168e811b4a2f185f07b6b

I have confirmed that this issue is still present on the latest trunk,
8d2fa90a41567670d2dbd4918d19d21d9bec4a8f

Compiling with -O0 on trunk returns the correct result.

For triage/priority purposes: this bug was not found in manually written code,
but instead from a fuzzer meant to test SIMD codegen

PS: This is my first bug on the GCC tracker, so if I've done anything wrong let
me know. I marked it as in "rtl-optimization" though I'm not sure if that's
correct
