> > We don't have an analysis phase for vect_transform_cycle_phi though since
> > it's a transform only method.  It seems to be that vectorizable_reduction is
> > more restrictive than the transform phases that it triggers
> > (vect_transform_cycle_phi, vect_transform_reduction).
> >
> > So I'm not sure where else we'd reject the must-use-reduc_epilogue though.
> > So I think it likely has to be one way?
> 
> The analysis phase for vect_transform_cycle_phi and vect_transform_reduction
> and vect_create_epilog_for_reduction (via vectorizable_live_operation) is
> vectorizable_reduction.  So we indeed have to reject the non-workable case
> there but we can use the flag to guide the heuristics in
> vect_transform_cycle_phi (short of replicating exactly the same
> conditions).
> 
> What I was saying there is that we still have (many) ad-hoc decisions
> made during transform which we should decide on during
> vectorizable_reduction and record as decision in a more simple form
> (that ideally includes whether we can re-use an accumulator, though
> that's a difficult thing as epilogue analysis has its own idea here).
> 

Ack, for now I've just allowed through the cases we know work.

--

The support for the new boolean reduction optabs didn't quite work for VLA
because the code later on still insists on the target having a
shift-and-insert optab (IFN_VEC_SHL_INSERT).

This is however not needed if the target can do the reduction using the new
optabs, the initial reduction value matches the neutral value, and we have a
single SLP lane that is not part of a reduction chain.
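
As a rough illustration of the condition described above, here is a standalone
sketch.  It is not the GCC code: the ReductionInfo struct, its field names and
must_reject are made up for the example, and it assumes the target has already
been found to support the reduction through the new optabs.

```cpp
// Hypothetical summary of the state the analysis looks at; these fields
// only mirror the prose description and are not GCC data structures.
struct ReductionInfo {
  bool has_neutral_op;        // a neutral value exists for the reduction
  bool double_reduc;          // double reduction
  bool nunits_constant;       // false for variable-length (VLA) vectors
  unsigned slp_lanes;         // number of SLP lanes
  bool reduc_chain;           // reduction is part of a reduction chain
  bool init_equals_neutral;   // initial value equals the neutral value
  bool target_has_shl_insert; // target provides shift-and-insert
};

// Returns true when analysis would have to reject the reduction because
// shift-and-insert is required but the target does not provide it.
bool must_reject(const ReductionInfo &r) {
  // Only reductions with a neutral value (or double reductions) on
  // variable-length vectors are affected at all.
  if (!(r.double_reduc || r.has_neutral_op))
    return false;
  if (r.nunits_constant)
    return false;

  // Shift-and-insert is unnecessary for a single, non-chained SLP lane
  // whose initial value already is the neutral value.
  bool can_skip = r.slp_lanes == 1 && !r.reduc_chain && r.init_equals_neutral;
  if (can_skip)
    return false;

  return !r.target_has_shl_insert;
}
```

With these assumptions, a single-lane, non-chained VLA reduction whose initial
value is the neutral value is accepted even without shift-and-insert, while a
multi-lane or chained one is still rejected on such a target.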

Bootstrapped and regtested on aarch64-none-linux-gnu,
arm-none-linux-gnueabihf and x86_64-pc-linux-gnu (-m32, -m64) with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

        * tree-vect-loop.cc (vectorizable_reduction): Don't always require
        IFN_VEC_SHL_INSERT when using reduc sbool optabs.

-- inline copy of patch --

diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index 6c2420249718237c4f70720b2bd03d4951bd8a5d..2925c21d97cb5887f6f0e2cb3a0c0e2cd38ae200 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -7565,6 +7565,8 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
      values into the low-numbered elements.  */
   if ((double_reduc || neutral_op)
       && !nunits_out.is_constant ()
+      && (SLP_TREE_LANES (slp_node) != 1 && !reduc_chain)
+      && !operand_equal_p (neutral_op, vect_phi_initial_value (reduc_def_phi))
       && !direct_internal_fn_supported_p (IFN_VEC_SHL_INSERT,
                                          vectype_out, OPTIMIZE_FOR_SPEED))
     {

Attachment: rb19933.patch