[Bug tree-optimization/109088] GCC does not always vectorize conditional reduction

juzhe.zhong at rivai dot ai via Gcc-bugs Tue, 26 Sep 2023 19:58:56 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109088


--- Comment #8 from JuzheZhong <juzhe.zhong at rivai dot ai> ---
It's because of the order of the operations we generate:

For code as follows:

result += mask ? a[i] + x : 0;

GCC:
result_ssa_1 = PHI <result_ssa_2, 0>
...
STMT 1. tmp = a[i] + x;
STMT 2. tmp2 = tmp + result_ssa_1;
STMT 3. result_ssa_2 = mask ? tmp2 : result_ssa_1;

Here we can see both STMT 2 and STMT 3 use 'result_ssa_1', so
we end up with 2 uses of the PHI result, and then we fail to vectorize.

Whereas LLVM:

result_ssa_1 = PHI <result_ssa_2, 0>
...
IR 1. tmp = a[i] + x;
IR 2. tmp2 = mask ? tmp : 0;
IR 3. result_ssa_2 = tmp2 + result_ssa_1;

LLVM only has 1 use of the PHI result.
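
To make the two orderings concrete, here is a minimal C sketch of both
forms written out as source-level loops. The function names, the
per-element mask array and the unsigned element type are assumptions
made for illustration; they are not taken from the bug report. How GCC
lowers either function to GIMPLE depends on the version and flags, so
this only shows that the two orderings compute the same sum, not what
the vectorizer will do with them.

```c
#include <stddef.h>

/* Ordering described for GCC: add first, select afterwards.
   The running sum is read twice per iteration. */
unsigned sum_select_after_add(const unsigned *a, const unsigned char *mask,
                              size_t n, unsigned x)
{
    unsigned result = 0;
    for (size_t i = 0; i < n; i++) {
        unsigned tmp = a[i] + x;
        unsigned tmp2 = tmp + result;       /* first use of result */
        result = mask[i] ? tmp2 : result;   /* second use of result */
    }
    return result;
}

/* Ordering described for LLVM: select first, add afterwards.
   The running sum is read once per iteration. */
unsigned sum_select_before_add(const unsigned *a, const unsigned char *mask,
                               size_t n, unsigned x)
{
    unsigned result = 0;
    for (size_t i = 0; i < n; i++) {
        unsigned tmp = a[i] + x;
        unsigned tmp2 = mask[i] ? tmp : 0;  /* masked-off lanes add 0 */
        result = tmp2 + result;             /* only use of result */
    }
    return result;
}
```

With unsigned arithmetic the two forms agree exactly, because adding 0
leaves the sum unchanged. For floating-point types the swap may need
more care, for example around signed zeros.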

Is it reasonable to swap the order in match.pd?
