https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121126
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
             Status|NEW                           |ASSIGNED

--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
One thing might be that we have four copies of vect_b_lsm in the loop and
we want to get at the last active (according to len) element in the epilog.
We try to do

      else if (LOOP_VINFO_FULLY_WITH_LENGTH_P (loop_vinfo))
        {
          /* Emit:

               SCALAR_RES = VEC_EXTRACT <VEC_LHS, LEN + BIAS - 1>

             where VEC_LHS is the vectorized live-out result and MASK is
             the loop mask for the final iteration.  */
          gcc_assert (ncopies == 1
                      && (!slp_node || SLP_TREE_LANES (slp_node) == 1));
          gimple_seq tem = NULL;
          gimple_stmt_iterator gsi = gsi_last (tem);
          tree len = vect_get_loop_len (loop_vinfo, &gsi,
                                        &LOOP_VINFO_LENS (loop_vinfo),
                                        1, vectype, 0, 1);

but that simply scales the loop len by 1/4 and then accesses the
len/4 - 1 element of the last vector:

  # loop_len_64 = PHI <_63(7)>
  _58 = loop_len_64 >> 2;
  _59 = _58 + 18446744073709551615;
  _60 = .VEC_EXTRACT (vect_b_lsm.22_57, _59);

but when we're in the last iteration, len might not even cover all of the
vectors.  There is the assert on ncopies == 1, but that's not effective
here (and not checked during analysis) - possibly it is meant to guard
against this issue.  The fully masked path might also be affected by this.

But as the vect_b_lsm vectors are uniform and fully populated with the
correct value, this cannot be the cause of the runtime failure.

On the other hand, the ncopies restriction is probably verified during
reduction analysis, but the inner loop PHI isn't from a reduction, and
LC PHI analysis does not do any such verification, because we expect
vectorizable_live_operation to do it, but that's short-circuited for
lc_phi_info_type nodes.
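
To make the index arithmetic above concrete, here is a small stand-alone C
model.  It is only a sketch and not GCC code.  It assumes that len counts
scalar elements across all four vector copies, that the copies hold
consecutive elements, and that each copy has 4 lanes (which is what the
">> 2" in the dump would correspond to under that assumption).  The names
NCOPIES, LANES, lane_pos, emitted_pos and expected_pos are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define NCOPIES 4   /* assumed: four copies of vect_b_lsm */
#define LANES   4   /* assumed: lanes per vector copy */

/* A (copy, lane) position inside the group of vector copies.  */
typedef struct {
  int copy;
  uint64_t lane;
} lane_pos;

/* Mirrors the dumped GIMPLE: the index is (len >> 2) + 2^64-1, i.e.
   len/4 - 1 with unsigned wraparound, and it is always applied to the
   last copy.  */
lane_pos emitted_pos (uint64_t len)
{
  lane_pos p;
  p.copy = NCOPIES - 1;
  p.lane = (len >> 2) + UINT64_C (18446744073709551615);
  return p;
}

/* Where the last active element sits under the layout assumed above.  */
lane_pos expected_pos (uint64_t len)
{
  assert (len >= 1 && len <= (uint64_t) NCOPIES * LANES);
  uint64_t last = len - 1;
  lane_pos p;
  p.copy = (int) (last / LANES);
  p.lane = last % LANES;
  return p;
}
```

In this model the two positions agree only when len covers every copy
(len == 16).  With len == 5 the last active element sits in copy 1 while the
extract reads copy 3, and with len == 3 the emitted index wraps around to
2^64-1.  As noted above, this mismatch is harmless as long as all copies are
uniform.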