https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109088

--- Comment #6 from JuzheZhong <juzhe.zhong at rivai dot ai> ---
After some investigation:

GCC fails to vectorize a reduction with multiple conditional operations:

ifcvt dump:

# result_20 = PHI <result_9(8), 0(18)>
...
_11 = result_20 + 10;
result_17 = _4 + _11;
_23 = _4 > _7;
result_9 = _23 ? result_17 : result_20;

It's odd that GCC fails to vectorize it, since these are not complicated
statements.
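
For reference, a scalar loop of roughly the following shape would lower to the
conditional reduction in the dump above. This is only a sketch: the function
name, parameter names and element type are assumptions, and it is not the
testcase attached to the bug.

```c
#include <stdint.h>

/* Hypothetical reconstruction of the reduction pattern; cond_reduc, a, b
   and int32_t are illustrative assumptions, not taken from the bug.  */
int32_t cond_reduc(const int32_t *a, const int32_t *b, int n)
{
    int32_t result = 0;                 /* result_20 starts at 0 */
    for (int i = 0; i < n; i++) {
        /* _23 = _4 > _7 */
        if (a[i] > b[i])
            /* _11 = result_20 + 10; result_17 = _4 + _11 */
            result += a[i] + 10;
        /* otherwise result_9 keeps the old value result_20 */
    }
    return result;
}
```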

In LLVM, it will vectorize them into:

vector_ssa_2 = <vector_ssa_result, 0>
...
vector_ssa_1 = vector_ssa_2 + 10;
vector_ssa_3 = vector_ssa_4 + vector_ssa_1;
mask_ssa_1 = vector_ssa_4 > vector_ssa_5;
vector_ssa_result = select <mask_ssa_1, vector_ssa_3, vector_ssa_2>

I think GCC should be able to vectorize it like LLVM:

vector_ssa_2 = <vector_ssa_result, 0>
...
vector_ssa_1 = vector_ssa_2 + 10;
vector_ssa_3 = vector_ssa_4 + vector_ssa_1;
mask_ssa_1 = vector_ssa_4 > vector_ssa_5;
vector_ssa_result = VCOND_MASK <mask_ssa_1, vector_ssa_3, vector_ssa_2>
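
To make the intended transformation concrete, here is a plain C emulation of
the masked reduction, one accumulator per lane. It follows the scalar ifcvt
dump, so the second add takes the loaded value (_4) as its operand. The vector
length VL, the function name and the element type are assumptions for
illustration; this is not meant to show what either compiler actually emits.

```c
#include <stdint.h>

#define VL 4  /* assumed vector length, for illustration only */

/* Hypothetical lane-by-lane emulation of the select-based reduction.  */
int32_t cond_reduc_lanes(const int32_t *a, const int32_t *b, int n)
{
    int32_t acc[VL] = {0};                    /* vector accumulator, initially 0 */
    int i = 0;

    for (; i + VL <= n; i += VL) {
        for (int l = 0; l < VL; l++) {
            int32_t va = a[i + l];            /* lane of the first loaded vector */
            int32_t vb = b[i + l];            /* lane of the second loaded vector */
            int32_t t1 = acc[l] + 10;         /* accumulator + 10 */
            int32_t t3 = va + t1;             /* loaded value + (accumulator + 10) */
            int mask = va > vb;               /* per-lane comparison mask */
            acc[l] = mask ? t3 : acc[l];      /* select: new value or old accumulator */
        }
    }

    /* Horizontal reduction of the per-lane partial sums.  */
    int32_t result = 0;
    for (int l = 0; l < VL; l++)
        result += acc[l];

    /* Scalar epilogue for the remaining elements.  */
    for (; i < n; i++)
        if (a[i] > b[i])
            result += a[i] + 10;

    return result;
}
```

Both the computed value and the old accumulator are available in every lane,
so the condition only picks between them and no control flow is needed inside
the vector body.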

I found that this code disables the vectorization:
      else if (!bbs.is_empty ()
               && bb->loop_father->header == bb
               && bb->loop_father->dont_vectorize)
        {
          if (dump_enabled_p ())
            dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
                             "splitting region at dont-vectorize loop %d "
                             "entry at bb%d\n",
                             bb->loop_father->num, bb->index);
          split = true;
        }

I am not familiar with this code. Any ideas? Thanks.
