https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109973
Bug ID: 109973
Summary: Wrong code for AVX2 since 13.1 by combining VPAND and
VPTEST
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: benjsith at gmail dot com
Target Milestone: ---
The following code is a minimal repro of the issue, when compiled with `gcc -O1
-mavx2`:
#include <immintrin.h>

int do_stuff(__m256i Y0, __m256i Y1, __m128i X2) {
    __m256i And01 = _mm256_and_si256(Y0, Y1);
    int TestResult = _mm256_testc_si256(And01, And01);
    return TestResult;
}
I have also attached the preprocessed version of that minimal repro.
GCC 12.3 produces the following assembly:
vpand ymm0, ymm0, ymm1 ; <<<<< missing in 13.1
mov eax, 0
vptest ymm0, ymm0
setb al
ret
GCC 13.1 instead generates:
mov eax, 0
vptest ymm0, ymm1
setb al
ret
Note that as of 13.1, the VPAND is removed, and just the VPTEST remains.
However, this is incorrect because _mm256_testc_si256 returns the value of the
Carry Flag, which VPTEST sets when the bitwise AND of the second operand and the
bitwise NOT of the first operand is all zeros. With And01 as both operands the
result is always 1, whereas `vptest ymm0, ymm1` tests Y1 AND NOT Y0, so the
VPAND can't be folded into the VPTEST for this intrinsic.
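To make the difference concrete, here is a small scalar sketch of the carry-flag rule. It is not the attached repro and does not use AVX2: it models a single 64-bit lane, and the names testc_model, original_form, combined_form and check_forms_differ are mine, made up for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of the VPTEST carry flag on one 64-bit lane:
   CF is 1 when (NOT a) AND b is all zeros. */
static int testc_model(uint64_t a, uint64_t b) {
    return (~a & b) == 0;
}

/* What the source asks for: testc(Y0 & Y1, Y0 & Y1). */
static int original_form(uint64_t y0, uint64_t y1) {
    uint64_t and01 = y0 & y1;
    return testc_model(and01, and01);
}

/* What the combined instruction computes: testc(Y0, Y1). */
static int combined_form(uint64_t y0, uint64_t y1) {
    return testc_model(y0, y1);
}

/* One input pair on which the two forms disagree. */
static void check_forms_differ(void) {
    uint64_t y0 = 0x0, y1 = 0x1;
    assert(original_form(y0, y1) == 1); /* (~x & x) is always zero */
    assert(combined_form(y0, y1) == 0); /* (~0 & 1) is nonzero */
}
```

Under this model, original_form returns 1 for every input, while combined_form depends on whether Y1 has bits set that Y0 lacks, so dropping the VPAND changes the result.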
Here is a Godbolt link showing the issue with a full execution case:
https://godbolt.org/z/x9or4WEWh
This issue is present in 13.1 but not 12.3. A bisect shows that it was most
likely introduced in a56c1641e9d25e46059168e811b4a2f185f07b6b.
I have confirmed that this issue is still present on the latest trunk,
8d2fa90a41567670d2dbd4918d19d21d9bec4a8f
-O0 on trunk will also return the correct result
For triage/priority purposes: this bug was not found in manually written code,
but instead from a fuzzer meant to test SIMD codegen
PS: This is my first bug on the GCC tracker, so if I've done anything wrong, let
me know. I filed it under "rtl-optimization", though I'm not sure if that's
correct.