https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107212
--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
So the speciality here is that with the SLP reduction we have the live lanes
split across the sum and the convert. That wreaks havoc with
vectorizable_reduction, which follows one of the lanes in the loop that
assigns STMT_VINFO_REDUC_DEF to the reduction chain. We simply do
    /* ??? For epilogue generation live members of the chain need
       to point back to the PHI via their original stmt for
       info_for_reduction to work. */
    if (STMT_VINFO_LIVE_P (vdef))
      STMT_VINFO_REDUC_DEF (def) = phi_info;
but in this case this misses one of the paths. Also we're not reliably
following the representative here. Plus vectorizable_live_operation
doesn't get the representative but the actual scalar stmt defining the
live lane (on purpose). So the fix is to make sure the above setting
of STMT_VINFO_REDUC_DEF covers all live lanes of the SLP node. For
vectorizable_live_operation the
    else
      /* For SLP reductions the meta-info is attached to
         the representative. */
      stmt_info = SLP_TREE_REPRESENTATIVE (slp_node);
is then wrong, and
    /* For SLP reductions we vectorize the epilogue for
       all involved stmts together. */
    else if (slp_index != 0)
      return true;
is then also suspicious, but it seems we cope with the conversions just
fine. So we're actually vectorizing the epilogue for live lane 0 of the
reduction chain, but analysis might end up not following the lane 0
SSA use-def chain. Identifying lane > 0 reductions serves only to avoid
non-reduction live code gen.
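
To illustrate the fix proposed above (making the STMT_VINFO_REDUC_DEF setting
cover all live lanes of the SLP node), here is a minimal toy model. It is not
GCC code: ToyStmt, ToySlpNode, link_live_lanes and unlinked_live_lanes are
hypothetical stand-ins invented for this sketch, and the fields only mimic what
STMT_VINFO_LIVE_P and STMT_VINFO_REDUC_DEF express in the vectorizer.

```cpp
#include <cassert>
#include <vector>

// Toy stand-ins only; these are NOT the GCC vectorizer data structures.
struct ToyStmt {
  bool live = false;                   // mimics STMT_VINFO_LIVE_P
  const ToyStmt *reduc_def = nullptr;  // mimics STMT_VINFO_REDUC_DEF
};

struct ToySlpNode {
  std::vector<ToyStmt *> lanes;        // one scalar stmt per SLP lane
};

// Point every live lane of the node back at the PHI, instead of only the
// single lane the analysis happened to follow.
void link_live_lanes(ToySlpNode &node, const ToyStmt &phi) {
  for (ToyStmt *stmt : node.lanes)
    if (stmt != nullptr && stmt->live)
      stmt->reduc_def = &phi;
}

// Count live lanes that still lack a link back to a PHI.
int unlinked_live_lanes(const ToySlpNode &node) {
  int count = 0;
  for (const ToyStmt *stmt : node.lanes)
    if (stmt != nullptr && stmt->live && stmt->reduc_def == nullptr)
      ++count;
  return count;
}
```

With a node whose live lanes are a sum stmt and a convert stmt, calling
link_live_lanes leaves unlinked_live_lanes at zero, which is the property the
epilogue generation would rely on in this simplified picture.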