https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122776

            Bug ID: 122776
           Summary: vectorizable_simd_clone_call looks at
                    LOOP_VINFO_FULLY_MASKED_P during analysis
           Product: gcc
           Version: 16.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

if (masked_call_offset == 0
    && n->simdclone->inbranch
    && n->simdclone->nargs > nargs)
  {
    gcc_assert (n->simdclone->args[n->simdclone->nargs - 1].arg_type ==
                SIMD_CLONE_ARG_TYPE_MASK);
    /* Penalize using a masked SIMD clone in a non-masked loop, that is
       not in a branch, as we'd have to construct an all-true mask.  */
    if (!loop_vinfo || !LOOP_VINFO_FULLY_MASKED_P (loop_vinfo))
      this_badness += 64;

but LOOP_VINFO_FULLY_MASKED_P is always false until we call
vect_determine_partial_vectors_and_peeling at the very end of analysis.

The penalty is therefore applied whenever this code runs during analysis,
which should result in always selecting a notinbranch variant of a simdclone
and thus in loop masking being disabled.

All LOOP_VINFO_{CAN,MUST}_USE_PARTIAL_VECTORS_P flags are transient during
analysis, so we probably need to discover both a "best" simdclone to use for
the masked case and a "best" one for the non-masked case.
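
A minimal standalone sketch of that idea, not GCC code: instead of consulting
a loop-level flag that is not final yet, compute the badness of each candidate
under both assumptions and remember one winner per case. The names
clone_candidate, best_clones, badness_for and pick_best_clones are
hypothetical, and base_badness stands in for every other criterion the real
selection uses. The sketch also assumes that only inbranch clones are eligible
for a fully masked loop, based on the observation above that picking a
notinbranch clone disables loop masking.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for a SIMD clone candidate.
// This is not the real cgraph_simd_clone structure.
struct clone_candidate
{
  bool inbranch;     // clone takes a trailing mask argument
  int base_badness;  // badness from all other (omitted) criteria
};

// Hypothetical result: index of the best candidate for each outcome of
// the partial-vectors decision, or -1 if no candidate is eligible.
struct best_clones
{
  int for_masked_loop;
  int for_unmasked_loop;
};

// Badness of C under an explicit assumption about loop masking.
// The +64 mirrors the quoted penalty: an inbranch clone used in a
// non-masked loop needs an all-true mask to be constructed.
static int
badness_for (const clone_candidate &c, bool loop_fully_masked)
{
  int badness = c.base_badness;
  if (c.inbranch && !loop_fully_masked)
    badness += 64;
  return badness;
}

// Pick one winner per case.  Ties keep the earlier candidate.
static best_clones
pick_best_clones (const std::vector<clone_candidate> &clones)
{
  best_clones best = { -1, -1 };
  int masked_badness = 0, unmasked_badness = 0;

  for (std::size_t i = 0; i < clones.size (); ++i)
    {
      const clone_candidate &c = clones[i];

      // Non-masked case: every candidate is eligible.
      int b = badness_for (c, false);
      if (best.for_unmasked_loop < 0 || b < unmasked_badness)
        {
          best.for_unmasked_loop = static_cast<int> (i);
          unmasked_badness = b;
        }

      // Masked case: assumption of this sketch, only inbranch clones
      // are eligible.
      if (!c.inbranch)
        continue;
      b = badness_for (c, true);
      if (best.for_masked_loop < 0 || b < masked_badness)
        {
          best.for_masked_loop = static_cast<int> (i);
          masked_badness = b;
        }
    }
  return best;
}
```

With one notinbranch and one inbranch candidate of equal base badness, the
sketch records the notinbranch clone for the non-masked case and the inbranch
clone for the masked case, so the final choice can be deferred until the
partial-vectors decision is known.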
