https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121126

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
So one thing might be that we have four copies of vect_b_lsm in the loop
and, in the epilog, we want to get at the last active element (according
to len).  We try to do

  else if (LOOP_VINFO_FULLY_WITH_LENGTH_P (loop_vinfo))
    {
      /* Emit:

         SCALAR_RES = VEC_EXTRACT <VEC_LHS, LEN + BIAS - 1>

         where VEC_LHS is the vectorized live-out result and MASK is
         the loop mask for the final iteration.  */
      gcc_assert (ncopies == 1
                  && (!slp_node || SLP_TREE_LANES (slp_node) == 1));
      gimple_seq tem = NULL;
      gimple_stmt_iterator gsi = gsi_last (tem);
      tree len = vect_get_loop_len (loop_vinfo, &gsi,
                                    &LOOP_VINFO_LENS (loop_vinfo),
                                    1, vectype, 0, 1);

but that simply scales the loop len by 1/4 and then accesses the len/4 - 1
element of the last vector:

  # loop_len_64 = PHI <_63(7)>
  _58 = loop_len_64 >> 2;
  _59 = _58 + 18446744073709551615;
  _60 = .VEC_EXTRACT (vect_b_lsm.22_57, _59);

but when we're in the last iteration len might not even cover all of the
vectors.
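
To make the index arithmetic concrete, here is a minimal C sketch of the
mismatch.  It is a model, not GCC code: NCOPIES, NUNITS, struct lane_ref and
both helpers are hypothetical names, NUNITS = 8 is an arbitrary choice, and it
assumes len counts active lanes across all copies in the final iteration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: NCOPIES vectors of NUNITS lanes per iteration.  */
enum { NCOPIES = 4, NUNITS = 8 };

/* Identifies one lane: which vector copy, and which lane inside it.  */
struct lane_ref { uint64_t copy; uint64_t lane; };

/* Mirrors the emitted sequence: scale len by 1/NCOPIES (the ">> 2"),
   add the all-ones constant (i.e. subtract 1 with unsigned wrap), and
   always extract from the last copy.  */
static struct lane_ref
emitted_index (uint64_t len)
{
  struct lane_ref r = { NCOPIES - 1, len / NCOPIES + UINT64_MAX };
  return r;
}

/* Where the last active lane lives under the assumptions above.  */
static struct lane_ref
last_active (uint64_t len)
{
  assert (len > 0 && len <= NCOPIES * NUNITS);
  struct lane_ref r = { (len - 1) / NUNITS, (len - 1) % NUNITS };
  return r;
}
```

Under this model the two agree for a full iteration (len == 32 gives copy 3,
lane 7 in both), but for len == 5 the last active lane is copy 0, lane 4 while
the emitted sequence reads copy 3, lane 0, and for len < 4 the emitted lane
index wraps around to the all-ones value.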

There is the assert on ncopies == 1, possibly to guard against this issue,
but it's not effective here (and not checked during analysis).

The fully masked path might also be affected by this.

But as the vect_b_lsm vectors are uniform and fully populated with the correct
value, this cannot be the cause of the runtime failure.

On the other hand, the ncopies thing is probably verified during reduction
analysis, but the inner loop PHI isn't from such, and LC PHI analysis does
not do any such verification, because we expect vectorizable_live_operation
to do it, but that's short-circuited for lc_phi_info_type nodes.
