https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112325
--- Comment #4 from liuhongt at gcc dot gnu.org ---
(In reply to liuhongt from comment #3)
> BB vectorizer relies on the backend support of .REDUC_PLUS for reduction,
> but loop vectorizer can manually do reduction. That's why it's not
> vectorized after cunrolli.
>
> After adding reduc_plus_scal_v4si, it's vectorized.
> Looks like we need to support
> reduc_plus_scal_{v4si,v8si,v16si,v8hi,v16hi,v32hi}
> Similar for reduc_{and,ior,xor}_scal_m.

The testcase below can be vectorized once reduc_plus_scal_v4si is supported,
but the original loop is still not vectorized, so there must be other issues.

int
foo (int* a)
{
  int sum = 0;
  sum += a[0];
  sum += a[1];
  sum += a[2];
  sum += a[3];
  return sum;
}
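
For illustration only, a minimal sketch of the distinction drawn in comment #3.
foo_loop is a hypothetical loop form of the testcase above (it is not the
original loop from this PR), and reduce_v4si_model is a scalar model of how a
4-lane sum can be computed by hand with two shuffle-and-add steps when no
single reduction operation is available. Both names are assumptions made for
this sketch, and the code GCC actually generates may look different.

```c
/* Hypothetical loop form of the straight-line testcase.  Per comment #3,
   the loop vectorizer can do the final reduction manually, so it does
   not need a reduc_plus_scal_<mode> pattern from the backend.  */
int
foo_loop (const int *a)
{
  int sum = 0;
  for (int i = 0; i < 4; i++)
    sum += a[i];
  return sum;
}

/* Scalar model (an assumption, not GCC output) of a manual V4SI
   reduction: fold the upper half onto the lower half, then fold
   lane 1 onto lane 0.  A reduc_plus_scal_v4si pattern would let the
   backend expose this whole sequence as one .REDUC_PLUS operation.  */
int
reduce_v4si_model (const int *a)
{
  int v[4] = { a[0], a[1], a[2], a[3] };

  /* Step 1: lanes {0,1} += lanes {2,3}.  */
  v[0] += v[2];
  v[1] += v[3];

  /* Step 2: lane 0 += lane 1.  */
  v[0] += v[1];

  return v[0];
}
```

Both functions return the same value as foo for any four-element input whose
sum does not overflow; the sketch only shows the shape of the computation and
says nothing about what each vectorizer pass emits.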