On Wed, 3 Jul 2024, Richard Sandiford wrote:

> Richard Biener <rguent...@suse.de> writes:
> > The following starts to handle NULL elements in SLP_TREE_SCALAR_STMTS
> > with the first candidate being the two-operator nodes where some
> > lanes are do-not-care and also do not have a scalar stmt computing
> > the result.  I originally added SLP_TREE_SCALAR_STMTS to two-operator
> > nodes but this exposes PR115764, so I've split that out.
> >
> > I have a patch use NULL elements for loads from groups with gaps
> > where we get around not doing that by having a load permutation.
> >
> > I'm currently re-bootstrapping and testing this, it passed multiple
> > testing rounds before (with the two-operator change).
> 
> Ah, so this might answer a question I was going to ask later,
> when I actually had time to do something with the answer.
> 
> For SVE:
> 
>       short *s;
> 
>       s[0] += 1;
>       s[2] += 2;
>       s[4] += 3;
>       s[6] += 4;
> 
> (contrived example) could use something like (little-endian):
> 
>       index   z1.s, #1, #1
>       ptrue   p0.s, vl4
>       ldrh    z0.h, p0/z, [s]
>       add     z0.h, z0.h, z1.h
>       strh    z0.h, p0, [s]
> 
> where the .h predicate is 1,0,1,0,1,0,1,0,0s...  The question was going
> to be how we should represent that in SLP, but I guess the answer is
> simply to create an 8 (or 7?) element SLP tree and fill in the blanks
> with nulls?

For now a NULL means there's no scalar stmt representing the computed
value and I think all existing uses would even support this to be
the "don't-care" representation.  For the above case both would work
for the load side but of course on the store side you still want to
have a .MASK_STORE with an explicit mask.  That is, NULLs are not
representative of the code we need to generate but rather just map
the vector SLP to the scalar definitions.

So I guess the answer to your question would be "yes" but of course
that's only half of the story - we do not support gaps for stores
at all at the moment.

I'm a bit undecided as to whether a NULL can be "do-not-care" yet.
The bst_traits::equal function treats it in an odd way - matching
NULL with NULL only, which is neither do-not-care (we could match
with any other def) nor "no scalar def but specific value required"
(we cannot match NULL with NULL).

Thanks,
Richard.

> Thanks,
> Richard
> 
> >
> > Richard.
> >
> >     * tree-vect-slp.cc (bst_traits::hash): Handle NULL elements
> >     in SLP_TREE_SCALAR_STMTS.
> >     (vect_print_slp_tree): Likewise.
> >     (vect_mark_slp_stmts): Likewise.
> >     (vect_mark_slp_stmts_relevant): Likewise.
> >     (vect_find_last_scalar_stmt_in_slp): Likewise.
> >     (vect_bb_slp_mark_live_stmts): Likewise.
> >     (vect_slp_prune_covered_roots): Likewise.
> >     (vect_bb_partition_graph_r): Likewise.
> >     (vect_remove_slp_scalar_calls): Likewise.
> >     (vect_slp_gather_vectorized_scalar_stmts): Likewise.
> >     (vect_bb_slp_scalar_cost): Likewise.
> >     (vect_contains_pattern_stmt_p): Likewise.
> >     (vect_slp_convert_to_external): Likewise.
> >     (vect_find_first_scalar_stmt_in_slp): Likewise.
> >     (vect_optimize_slp_pass::remove_redundant_permutations): Likewise.
> >     (vect_slp_analyze_node_operations_1): Likewise.
> >     (vect_schedule_slp_node): Likewise.
> >     * tree-vect-stmts.cc (can_vectorize_live_stmts): Likewise.
> >     (vectorizable_shift): Likewise.
> >     * tree-vect-data-refs.cc (vect_slp_analyze_load_dependences):
> >     Handle NULL elements in SLP_TREE_SCALAR_STMTS.
> > ---
> >  gcc/tree-vect-data-refs.cc |  2 +
> >  gcc/tree-vect-slp.cc       | 76 +++++++++++++++++++++++---------------
> >  gcc/tree-vect-stmts.cc     | 22 ++++++-----
> >  3 files changed, 61 insertions(+), 39 deletions(-)
> >
> > diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
> > index 959e127c385..39fd887a96b 100644
> > --- a/gcc/tree-vect-data-refs.cc
> > +++ b/gcc/tree-vect-data-refs.cc
> > @@ -1041,6 +1041,8 @@ vect_slp_analyze_load_dependences (vec_info *vinfo, 
> > slp_tree node,
> >  
> >    for (unsigned k = 0; k < SLP_TREE_SCALAR_STMTS (node).length (); ++k)
> >      {
> > +      if (! SLP_TREE_SCALAR_STMTS (node)[k])
> > +   continue;
> >        stmt_vec_info access_info
> >     = vect_orig_stmt (SLP_TREE_SCALAR_STMTS (node)[k]);
> >        if (access_info == first_access_info)
> > diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
> > index b060161c021..7a9aa86f517 100644
> > --- a/gcc/tree-vect-slp.cc
> > +++ b/gcc/tree-vect-slp.cc
> > @@ -356,7 +356,7 @@ vect_contains_pattern_stmt_p (vec<stmt_vec_info> stmts)
> >    stmt_vec_info stmt_info;
> >    unsigned int i;
> >    FOR_EACH_VEC_ELT (stmts, i, stmt_info)
> > -    if (is_pattern_stmt_p (stmt_info))
> > +    if (stmt_info && is_pattern_stmt_p (stmt_info))
> >        return true;
> >    return false;
> >  }
> > @@ -1592,7 +1592,7 @@ bst_traits::hash (value_type x)
> >  {
> >    inchash::hash h;
> >    for (unsigned i = 0; i < x.length (); ++i)
> > -    h.add_int (gimple_uid (x[i]->stmt));
> > +    h.add_int (x[i] ? gimple_uid (x[i]->stmt) : -1);
> >    return h.end ();
> >  }
> >  inline bool
> > @@ -2801,9 +2801,12 @@ vect_print_slp_tree (dump_flags_t dump_kind, 
> > dump_location_t loc,
> >      }
> >    if (SLP_TREE_SCALAR_STMTS (node).exists ())
> >      FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (node), i, stmt_info)
> > -      dump_printf_loc (metadata, user_loc, "\t%sstmt %u %G",
> > -                  STMT_VINFO_LIVE_P (stmt_info) ? "[l] " : "",
> > -                  i, stmt_info->stmt);
> > +      if (stmt_info)
> > +   dump_printf_loc (metadata, user_loc, "\t%sstmt %u %G",
> > +                    STMT_VINFO_LIVE_P (stmt_info) ? "[l] " : "",
> > +                    i, stmt_info->stmt);
> > +      else
> > +   dump_printf_loc (metadata, user_loc, "\tstmt %u ---\n", i);
> >    else
> >      {
> >        dump_printf_loc (metadata, user_loc, "\t{ ");
> > @@ -2944,7 +2947,8 @@ vect_mark_slp_stmts (slp_tree node, 
> > hash_set<slp_tree> &visited)
> >      return;
> >  
> >    FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (node), i, stmt_info)
> > -    STMT_SLP_TYPE (stmt_info) = pure_slp;
> > +    if (stmt_info)
> > +      STMT_SLP_TYPE (stmt_info) = pure_slp;
> >  
> >    FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
> >      if (child)
> > @@ -2974,11 +2978,12 @@ vect_mark_slp_stmts_relevant (slp_tree node, 
> > hash_set<slp_tree> &visited)
> >      return;
> >  
> >    FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (node), i, stmt_info)
> > -    {
> > -      gcc_assert (!STMT_VINFO_RELEVANT (stmt_info)
> > -                  || STMT_VINFO_RELEVANT (stmt_info) == 
> > vect_used_in_scope);
> > -      STMT_VINFO_RELEVANT (stmt_info) = vect_used_in_scope;
> > -    }
> > +    if (stmt_info)
> > +      {
> > +   gcc_assert (!STMT_VINFO_RELEVANT (stmt_info)
> > +               || STMT_VINFO_RELEVANT (stmt_info) == vect_used_in_scope);
> > +   STMT_VINFO_RELEVANT (stmt_info) = vect_used_in_scope;
> > +      }
> >  
> >    FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
> >      if (child)
> > @@ -3029,10 +3034,11 @@ vect_find_last_scalar_stmt_in_slp (slp_tree node)
> >    stmt_vec_info stmt_vinfo;
> >  
> >    for (int i = 0; SLP_TREE_SCALAR_STMTS (node).iterate (i, &stmt_vinfo); 
> > i++)
> > -    {
> > -      stmt_vinfo = vect_orig_stmt (stmt_vinfo);
> > -      last = last ? get_later_stmt (stmt_vinfo, last) : stmt_vinfo;
> > -    }
> > +    if (stmt_vinfo)
> > +      {
> > +   stmt_vinfo = vect_orig_stmt (stmt_vinfo);
> > +   last = last ? get_later_stmt (stmt_vinfo, last) : stmt_vinfo;
> > +      }
> >  
> >    return last;
> >  }
> > @@ -3046,12 +3052,13 @@ vect_find_first_scalar_stmt_in_slp (slp_tree node)
> >    stmt_vec_info stmt_vinfo;
> >  
> >    for (int i = 0; SLP_TREE_SCALAR_STMTS (node).iterate (i, &stmt_vinfo); 
> > i++)
> > -    {
> > -      stmt_vinfo = vect_orig_stmt (stmt_vinfo);
> > -      if (!first
> > -     || get_later_stmt (stmt_vinfo, first) == first)
> > -   first = stmt_vinfo;
> > -    }
> > +    if (stmt_vinfo)
> > +      {
> > +   stmt_vinfo = vect_orig_stmt (stmt_vinfo);
> > +   if (!first
> > +       || get_later_stmt (stmt_vinfo, first) == first)
> > +     first = stmt_vinfo;
> > +      }
> >  
> >    return first;
> >  }
> > @@ -6211,6 +6218,7 @@ vect_optimize_slp_pass::remove_redundant_permutations 
> > ()
> >         {
> >           if (j != 0
> >               && (next_load_info != load_info
> > +                 || ! load_info
> >                   || DR_GROUP_GAP (load_info) != 1))
> >             {
> >               subchain_p = false;
> > @@ -6827,7 +6835,8 @@ vect_slp_analyze_node_operations_1 (vec_info *vinfo, 
> > slp_tree node,
> >        unsigned int i;
> >        FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (node), i, slp_stmt_info)
> >     {
> > -     if (STMT_VINFO_LIVE_P (slp_stmt_info)
> > +     if (slp_stmt_info
> > +         && STMT_VINFO_LIVE_P (slp_stmt_info)
> >           && !vectorizable_live_operation (vinfo, slp_stmt_info, node,
> >                                            node_instance, i,
> >                                            false, cost_vec))
> > @@ -6859,6 +6868,10 @@ vect_slp_convert_to_external (vec_info *vinfo, 
> > slp_tree node,
> >        || VECTOR_BOOLEAN_TYPE_P (SLP_TREE_VECTYPE (node)))
> >      return false;
> >  
> > +  FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (node), i, stmt_info)
> > +    if (!stmt_info)
> > +      return false;
> > +
> >    if (dump_enabled_p ())
> >      dump_printf_loc (MSG_NOTE, vect_location,
> >                  "Building vector operands of %p from scalars instead\n",
> > @@ -7217,7 +7230,7 @@ vect_bb_slp_mark_live_stmts (bb_vec_info bb_vinfo, 
> > slp_tree node,
> >    stmt_vec_info last_stmt = vect_find_last_scalar_stmt_in_slp (node);
> >    FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (node), i, stmt_info)
> >      {
> > -      if (svisited.contains (stmt_info))
> > +      if (!stmt_info || svisited.contains (stmt_info))
> >     continue;
> >        stmt_vec_info orig_stmt_info = vect_orig_stmt (stmt_info);
> >        if (STMT_VINFO_IN_PATTERN_P (orig_stmt_info)
> > @@ -7406,7 +7419,8 @@ vect_slp_prune_covered_roots (slp_tree node, 
> > hash_set<stmt_vec_info> &roots,
> >    stmt_vec_info stmt;
> >    unsigned i;
> >    FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (node), i, stmt)
> > -    roots.remove (vect_orig_stmt (stmt));
> > +    if (stmt)
> > +      roots.remove (vect_orig_stmt (stmt));
> >  
> >    slp_tree child;
> >    FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
> > @@ -7582,8 +7596,9 @@ vect_bb_partition_graph_r (bb_vec_info bb_vinfo,
> >    unsigned i;
> >  
> >    FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (node), i, stmt_info)
> > -    vect_map_to_instance (instance, stmt_info, stmt_to_instance,
> > -                     instance_leader);
> > +    if (stmt_info)
> > +      vect_map_to_instance (instance, stmt_info, stmt_to_instance,
> > +                       instance_leader);
> >  
> >    if (vect_map_to_instance (instance, node, node_to_instance,
> >                         instance_leader))
> > @@ -7651,7 +7666,8 @@ vect_slp_gather_vectorized_scalar_stmts (vec_info 
> > *vinfo, slp_tree node,
> >    if (SLP_TREE_DEF_TYPE (node) == vect_internal_def)
> >      {
> >        FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (node), i, stmt_info)
> > -   vstmts.add (stmt_info);
> > +   if (stmt_info)
> > +     vstmts.add (stmt_info);
> >  
> >        FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
> >     if (child)
> > @@ -7691,7 +7707,7 @@ vect_bb_slp_scalar_cost (vec_info *vinfo,
> >        ssa_op_iter op_iter;
> >        def_operand_p def_p;
> >  
> > -      if ((*life)[i])
> > +      if (!stmt_info || (*life)[i])
> >     continue;
> >  
> >        stmt_vec_info orig_stmt_info = vect_orig_stmt (stmt_info);
> > @@ -10102,7 +10118,7 @@ vect_schedule_slp_node (vec_info *vinfo,
> >        stmt_vec_info slp_stmt_info;
> >        unsigned int i;
> >        FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (node), i, slp_stmt_info)
> > -   if (STMT_VINFO_LIVE_P (slp_stmt_info))
> > +   if (slp_stmt_info && STMT_VINFO_LIVE_P (slp_stmt_info))
> >       {
> >         done = vectorizable_live_operation (vinfo, slp_stmt_info, node,
> >                                             instance, i, true, NULL);
> > @@ -10146,6 +10162,8 @@ vect_remove_slp_scalar_calls (vec_info *vinfo,
> >  
> >    FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (node), i, stmt_info)
> >      {
> > +      if (!stmt_info)
> > +   continue;
> >        gcall *stmt = dyn_cast <gcall *> (stmt_info->stmt);
> >        if (!stmt || gimple_bb (stmt) == NULL)
> >     continue;
> > diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> > index b12b6ada029..742517a81bd 100644
> > --- a/gcc/tree-vect-stmts.cc
> > +++ b/gcc/tree-vect-stmts.cc
> > @@ -6191,11 +6191,12 @@ vectorizable_shift (vec_info *vinfo,
> >       stmt_vec_info slpstmt_info;
> >  
> >       FOR_EACH_VEC_ELT (stmts, k, slpstmt_info)
> > -       {
> > -         gassign *slpstmt = as_a <gassign *> (slpstmt_info->stmt);
> > -         if (!operand_equal_p (gimple_assign_rhs2 (slpstmt), op1, 0))
> > -           scalar_shift_arg = false;
> > -       }
> > +       if (slpstmt_info)
> > +         {
> > +           gassign *slpstmt = as_a <gassign *> (slpstmt_info->stmt);
> > +           if (!operand_equal_p (gimple_assign_rhs2 (slpstmt), op1, 0))
> > +             scalar_shift_arg = false;
> > +         }
> >  
> >       /* For internal SLP defs we have to make sure we see scalar stmts
> >          for all vector elements.
> > @@ -13095,11 +13096,12 @@ can_vectorize_live_stmts (vec_info *vinfo, 
> > stmt_vec_info stmt_info,
> >        unsigned int i;
> >        FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (slp_node), i, slp_stmt_info)
> >     {
> > -     if ((STMT_VINFO_LIVE_P (slp_stmt_info)
> > -          || (loop_vinfo
> > -              && LOOP_VINFO_EARLY_BREAKS (loop_vinfo)
> > -              && STMT_VINFO_DEF_TYPE (slp_stmt_info)
> > -                   == vect_induction_def))
> > +     if (slp_stmt_info
> > +         && (STMT_VINFO_LIVE_P (slp_stmt_info)
> > +             || (loop_vinfo
> > +                 && LOOP_VINFO_EARLY_BREAKS (loop_vinfo)
> > +                 && STMT_VINFO_DEF_TYPE (slp_stmt_info)
> > +                 == vect_induction_def))
> >           && !vectorizable_live_operation (vinfo, slp_stmt_info, slp_node,
> >                                            slp_node_instance, i,
> >                                            vec_stmt_p, cost_vec))
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Reply via email to