https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97299
--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
OK, so this is a bad interaction of pattern detection and SLP reduction
vectorization.  We're detecting

/home/rguenther/src/trunk/gcc/testsuite/gcc.dg/vect/slp-reduc-3.c:26:14: note:
  widen_sum pattern recognized: patt_47 = prod_24 w+ res2_31;

which in turn empties LOOP_VINFO_REDUCTIONS via

      /* Patterns cannot be vectorized using SLP, because they change the order
         of computation.  */
      if (loop_vinfo)
        {
          unsigned ix, ix2;
          stmt_vec_info *elem_ptr;
          VEC_ORDERED_REMOVE_IF (LOOP_VINFO_REDUCTIONS (loop_vinfo), ix, ix2,
                                 elem_ptr, *elem_ptr == stmt_info);
        }

which means we're ending up with interleaving (and worse code).  The
sentence above should only apply to reduction ops performing lane
reduction (SAD_EXPR, but also WIDEN_SUM_EXPR).
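
For readers without the testcase at hand, here is a minimal sketch of the kind of loop that could trigger this: two independent reductions in one loop body, each accumulating 16-bit values into a 32-bit sum. This is an assumed illustration, not the contents of slp-reduc-3.c; the function name two_widening_sums and its parameters are hypothetical. The `res2 += prod` statement is the shape that the widen_sum pattern in the dump above (`prod w+ res2`) would plausibly match.

```c
#include <stddef.h>

/* Hypothetical example, NOT the slp-reduc-3.c testcase: two independent
   reductions in the same loop, a candidate for SLP reduction
   vectorization.  Both accumulators are wider than their inputs, so each
   update is a widening sum.  */
void
two_widening_sums (const unsigned short *a, const unsigned short *b,
                   size_t n, unsigned int out[2])
{
  unsigned int res1 = 0;
  unsigned int res2 = 0;

  for (size_t i = 0; i < n; i++)
    {
      /* Multiply in unsigned int to avoid signed overflow after integer
         promotion, then truncate back to 16 bits.  */
      unsigned short prod = (unsigned short) ((unsigned int) a[i] * b[i]);

      res1 += a[i];  /* 16-bit value added into a 32-bit accumulator.  */
      res2 += prod;  /* Same shape as "prod w+ res2" in the dump.  */
    }

  out[0] = res1;
  out[1] = res2;
}
```

Whether GCC actually recognizes the pattern here depends on the target and the options used; compiling with -O3 -fdump-tree-vect-details and looking for "widen_sum pattern recognized" in the vect dump is one way to check.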