https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107212

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
So the speciality here is that with the SLP reduction we have the live lanes
split across the sum and the convert.  That wreaks havoc with
vectorizable_reduction, which follows only one of the lanes in the loop
that assigns STMT_VINFO_REDUC_DEF to the reduction chain.  There we simply do

      /* ???  For epilogue generation live members of the chain need
         to point back to the PHI via their original stmt for
         info_for_reduction to work.  */
      if (STMT_VINFO_LIVE_P (vdef))
        STMT_VINFO_REDUC_DEF (def) = phi_info;

but in this case that misses one of the paths.  Also, we're not reliably
following the representative here.  In addition, vectorizable_live_operation
is passed not the representative but the actual scalar stmt defining the
live lane (on purpose).  So the fix is to make sure the above setting
of STMT_VINFO_REDUC_DEF covers all live lanes of the SLP node.  For
vectorizable_live_operation the

          else
            /* For SLP reductions the meta-info is attached to
               the representative.  */
            stmt_info = SLP_TREE_REPRESENTATIVE (slp_node);

is then wrong, and

          /* For SLP reductions we vectorize the epilogue for
             all involved stmts together.  */
          else if (slp_index != 0)
            return true;

is also suspicious, but it seems we cope with the conversions just
fine.  So we're actually vectorizing the epilogue for live lane 0 of the
reduction chain, but analysis might end up not following the lane 0
SSA use-def chain; identifying lane > 0 reductions only serves to avoid
non-reduction live code gen.
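
As a toy illustration of the proposed fix (this is not GCC code: ToyStmt,
ToySlpNode and link_live_lanes_to_phi are hypothetical names invented for
this sketch, and the real vectorizer structures are far richer), the idea
of pointing every live lane of an SLP node back at the reduction PHI,
rather than only the one lane the chain walk happens to follow, could
look roughly like this:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-ins only; these are NOT GCC's stmt_vec_info or slp_tree.
struct ToyStmt {
  bool live = false;                   // loosely models STMT_VINFO_LIVE_P
  const ToyStmt *reduc_def = nullptr;  // loosely models STMT_VINFO_REDUC_DEF
};

struct ToySlpNode {
  std::vector<ToyStmt *> lanes;        // one scalar stmt per SLP lane
};

// Link every live lane of the node to the reduction PHI, so that a later
// per-lane lookup (as done for live operations) finds the PHI regardless
// of which lane's use-def chain the analysis followed.
// Returns the number of lanes that were linked.
std::size_t link_live_lanes_to_phi(ToySlpNode &node, const ToyStmt &phi) {
  std::size_t linked = 0;
  for (ToyStmt *lane : node.lanes) {
    if (lane->live) {
      lane->reduc_def = &phi;
      ++linked;
    }
  }
  return linked;
}
```

With the live lanes split across a sum and a convert, both would end up
linked to the PHI here, while a non-live lane is left untouched.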
