On Wed, 22 Oct 2025, Tamar Christina wrote:

> > > We don't have an analysis phase for vect_transform_cycle_phi though, since
> > > it's a transform-only method.  It seems to me that vectorizable_reduction
> > > is more restrictive than the transform phases that it triggers
> > > (vect_transform_cycle_phi, vect_transform_reduction).
> > >
> > > So I'm not sure where else we'd reject the must-use-reduc_epilogue though.
> > > So I think it likely has to be one way?
> > 
> > The analysis phase for vect_transform_cycle_phi and vect_transform_reduction
> > and vect_create_epilog_for_reduction (via vectorizable_live_operation) is
> > vectorizable_reduction.  So we indeed have to reject the non-workable case
> > there but we can use the flag to guide the heuristics in
> > vect_transform_cycle_phi (short of replicating exactly the same
> > conditions).
> > 
> > What I was saying there is that we still have (many) ad-hoc decisions
> > made during transform which we should decide on during
> > vectorizable_reduction and record as a decision in a simpler form
> > (that ideally includes whether we can re-use an accumulator, though
> > that's a difficult thing as epilogue analysis has its own idea here).
> > 
> 
> Ack, I've for now just allowed through the cases we know work.
> 
> --
> 
> The support for the new boolean reduction optabs didn't quite work for VLA
> because the code later on insists on the target still having a
> shift-and-insert optab.
> 
> This is however not needed if the target can do the reduction using the new
> optabs, and the initial reduction value matches the neutral value and we
> have one SLP lane while not having a reduction chain.
> 
> Bootstrapped and regtested on aarch64-none-linux-gnu,
> arm-none-linux-gnueabihf, and x86_64-pc-linux-gnu (-m32, -m64) with no issues.
> 
> Ok for master?

OK.

> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>       * tree-vect-loop.cc (vectorizable_reduction): Don't always require
>       IFN_VEC_SHL_INSERT when using reduc sbool optabs.
> -- inline copy of patch --
> 
> diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> index 6c2420249718237c4f70720b2bd03d4951bd8a5d..2925c21d97cb5887f6f0e2cb3a0c0e2cd38ae200 100644
> --- a/gcc/tree-vect-loop.cc
> +++ b/gcc/tree-vect-loop.cc
> @@ -7565,6 +7565,8 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
>       values into the low-numbered elements.  */
>    if ((double_reduc || neutral_op)
>        && !nunits_out.is_constant ()
> +      && (SLP_TREE_LANES (slp_node) != 1 && !reduc_chain)
> +      && !operand_equal_p (neutral_op, vect_phi_initial_value (reduc_def_phi))
>        && !direct_internal_fn_supported_p (IFN_VEC_SHL_INSERT,
>                                         vectype_out, OPTIMIZE_FOR_SPEED))
>      {
> 

-- 
Richard Biener <[email protected]>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
