https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97299
--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
OK, so this is a bad interaction of pattern detection and SLP reduction
vectorization. We're detecting
/home/rguenther/src/trunk/gcc/testsuite/gcc.dg/vect/slp-reduc-3.c:26:14: note:
widen_sum pattern recognized: patt_47 = prod_24 w+ res2_31;
which in turn empties LOOP_VINFO_REDUCTIONS via
  /* Patterns cannot be vectorized using SLP, because they change the order of
     computation.  */
  if (loop_vinfo)
    {
      unsigned ix, ix2;
      stmt_vec_info *elem_ptr;
      VEC_ORDERED_REMOVE_IF (LOOP_VINFO_REDUCTIONS (loop_vinfo), ix, ix2,
                             elem_ptr, *elem_ptr == stmt_info);
    }
which means we're ending up with interleaving (and worse code).
The claim in that code comment should only apply to reduction ops that
perform a lane reduction (SAD_EXPR, but also WIDEN_SUM_EXPR).
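
To illustrate why lane reduction in particular conflicts with SLP reductions,
here is a minimal sketch in plain C. It is a simplified model, not GCC code:
the names model_widen_sum and model_reduce are hypothetical, and folding
adjacent pairs of lanes is just one possible lane mapping assumed for the
example.

```c
#include <stdint.h>

#define NARROW_LANES 8   /* 16-bit input lanes (assumed vector width) */
#define WIDE_LANES   4   /* 32-bit accumulator lanes */

/* Hypothetical model of a lane-reducing widening sum: eight 16-bit inputs
   are folded into four 32-bit accumulator lanes by adding adjacent pairs.
   The pairing is an assumption made for illustration only.  */
static void
model_widen_sum (const uint16_t in[NARROW_LANES], uint32_t acc[WIDE_LANES])
{
  for (int lane = 0; lane < WIDE_LANES; lane++)
    acc[lane] += (uint32_t) in[2 * lane] + (uint32_t) in[2 * lane + 1];
}

/* Model of the reduction epilogue: add up all accumulator lanes.  */
static uint32_t
model_reduce (const uint32_t acc[WIDE_LANES])
{
  uint32_t total = 0;
  for (int lane = 0; lane < WIDE_LANES; lane++)
    total += acc[lane];
  return total;
}
```

With the input {1, 10, 2, 20, 3, 30, 4, 40}, where even elements would belong
to one reduction and odd elements to a second one, the accumulator becomes
{11, 22, 33, 44}. The overall total (110) is still right for a single
reduction, but every lane now mixes both reductions, so the per-reduction
sums (10 and 100) can no longer be recovered from the lanes. An operation
that keeps the lane layout does not have this problem.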