On Wed, 22 Oct 2025, Tamar Christina wrote:
> > > We don't have an analysis phase for vect_transform_cycle_phi though since it's
> > > a transform only method. It seems to me that vectorizable_reduction is more
> > > restrictive than the transform phases that it triggers (vect_transform_cycle_phi,
> > > vect_transform_reduction).
> > >
> > > So I'm not sure where else we'd reject the must-use-reduc_epilogue though. So I
> > > think it likely has to be one way?
> >
> > The analysis phase for vect_transform_cycle_phi and vect_transform_reduction
> > and vect_create_epilog_for_reduction (via vectorizable_live_operation) is
> > vectorizable_reduction. So we indeed have to reject the non-workable case
> > there but we can use the flag to guide the heuristics in
> > vect_transform_cycle_phi (short of replicating exactly the same
> > conditions).
> >
> > What I was saying there is that we still have (many) ad-hoc decisions
> > made during transform which we should decide on during
> > vectorizable_reduction and record as a decision in a simpler form
> > (that ideally includes whether we can re-use an accumulator, though
> > that's a difficult thing as epilogue analysis has its own idea here).
> >
>
> Ack, I've for now just allowed through the cases we know work.
>
> --
>
> The support for the new boolean reduction optabs didn't quite work for VLA
> because the code later on insists on the target still having a shift-and-insert
> optab.
>
> This is however not needed if the target can do the reduction using the new
> optabs, and the initial reduction value matches the neutral value and we
> have one SLP lane while not having a reduction chain.
>
> Bootstrapped and regtested on aarch64-none-linux-gnu,
> arm-none-linux-gnueabihf, x86_64-pc-linux-gnu
> -m32, -m64 with no issues.
>
> Ok for master?
OK.
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
> * tree-vect-loop.cc (vectorizable_reduction): Don't always require
> IFN_VEC_SHL_INSERT when using reduc sbool optabs.
> -- inline copy of patch --
>
> diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> index 6c2420249718237c4f70720b2bd03d4951bd8a5d..2925c21d97cb5887f6f0e2cb3a0c0e2cd38ae200 100644
> --- a/gcc/tree-vect-loop.cc
> +++ b/gcc/tree-vect-loop.cc
> @@ -7565,6 +7565,8 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
>       values into the low-numbered elements.  */
>    if ((double_reduc || neutral_op)
>        && !nunits_out.is_constant ()
> +      && (SLP_TREE_LANES (slp_node) != 1 && !reduc_chain)
> +      && !operand_equal_p (neutral_op, vect_phi_initial_value (reduc_def_phi))
>        && !direct_internal_fn_supported_p (IFN_VEC_SHL_INSERT,
>                                            vectype_out, OPTIMIZE_FOR_SPEED))
>      {
>
--
Richard Biener <[email protected]>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)