https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121059
--- Comment #10 from Richard Sandiford <rsandifo at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #9)
> vectorizable_operation during transform does
>
>           /* When combining two masks check if either of them is elsewhere
>              combined with a loop mask, if that's the case we can mark that the
>              new combined mask doesn't need to be combined with a loop mask.  */
>           if (masked_loop_p
>               && code == BIT_AND_EXPR
>               && VECTOR_BOOLEAN_TYPE_P (vectype))
>             {
>               if (loop_vinfo->scalar_cond_masked_set.contains ({ op0, 1 }))
>                 {
>                   mask = vect_get_loop_mask (loop_vinfo, gsi, masks,
>                                              vec_num, vectype, i);
>
> but that's not reflected by analysis, which fails to record a loop mask
> for !mask_out_inactive operations.  So the fix is as simple as the following,
> but this might put us to using masks?  There is no good way to do this
> I guess.  The scalar_cond_masked_set optimization does not have a
> corresponding len operation.  I'm not sure what we can do here?
>
> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> index 4aa69da2218..55002bd0cc2 100644
> --- a/gcc/tree-vect-stmts.cc
> +++ b/gcc/tree-vect-stmts.cc
> @@ -6978,6 +6978,16 @@ vectorizable_operation (vec_info *vinfo,
>               LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo) = false;
>             }
>         }
> +      else if (loop_vinfo
> +              && LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo)
> +              && code == BIT_AND_EXPR
> +              && VECTOR_BOOLEAN_TYPE_P (vectype))
> +       vect_record_loop_mask (loop_vinfo, masks, vec_num, vectype, NULL);
>
>        /* Put types on constant and invariant SLP children.  */
>        if (!vect_maybe_update_slp_op_vectype (slp_op0, vectype)

Yeah, we shouldn't do that.  The question is why op0 is in
scalar_cond_masked_set with masked_loop_p true if there's no associated
loop mask.
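
For illustration only, the mismatch discussed above can be reduced to a toy
model.  This is not GCC code: ToyLoopInfo, recorded_loop_masks and both
functions are hypothetical names, and a plain string stands in for the
{ op0, 1 } key.  It only sketches the invariant being asked about, namely
that whenever the transform-time check would fetch a loop mask, analysis
should already have recorded one.

```cpp
// Toy model only: none of these types or functions exist in GCC.
#include <set>
#include <string>

struct ToyLoopInfo {
  // Stand-in for "the loop is being vectorized with loop masks".
  bool masked_loop_p = false;
  // Stand-in for scalar_cond_masked_set: conditions that are combined
  // with a loop mask somewhere else in the loop.
  std::set<std::string> scalar_cond_masked_set;
  // Stand-in for the number of loop masks recorded during analysis.
  unsigned recorded_loop_masks = 0;
};

// Same shape as the quoted transform-time check: a boolean-vector AND
// whose first operand is in the set wants to fetch a loop mask.
bool transform_wants_loop_mask(const ToyLoopInfo &info, bool is_bool_and,
                               const std::string &op0) {
  return info.masked_loop_p && is_bool_and
         && info.scalar_cond_masked_set.count(op0) != 0;
}

// The invariant in question: a mask wanted at transform time implies
// that analysis recorded at least one loop mask.
bool analysis_and_transform_agree(const ToyLoopInfo &info, bool is_bool_and,
                                  const std::string &op0) {
  return !transform_wants_loop_mask(info, is_bool_and, op0)
         || info.recorded_loop_masks > 0;
}
```

In this model the reported situation corresponds to masked_loop_p being
true and op0 being in the set while recorded_loop_masks is still zero, so
analysis_and_transform_agree returns false.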