On Mon, 18 Nov 2019, Tamar Christina wrote:

> Hi Richi,
> 
> Thanks for the feedback, if you wouldn't mind indulging me quickly (for the 
> version for next stage-1)
> 
> The original approach we had was indeed using a vec-pattern which turned
> 
> best_i_11 = _4 > best_26 ? i_27 : best_i_25;
> best_13 = MAX_EXPR <_4, best_26>;
> 
> into
> 
> best_13 = MAX_EXPR <_4, best_26>;
> best_i_11 = _4 == best_13 ? best_i_25 : i_27;

Ah, interesting way to rewrite this indeed (didn't think of it).  I
think for FP compares this might be a bit iffy though.

> which was:
> 
> static gimple *
> vect_recog_minmax_index_pattern (stmt_vec_info stmt_vinfo, tree *type_out)
> {
>   tree oprnd0, oprnd1;
>   gimple *last_stmt = stmt_vinfo->stmt;
>   vec_info *vinfo = stmt_vinfo->vinfo;
>   tree type;
>   gimple *use_stmt;
>   imm_use_iterator iter;
>   gimple_stmt_iterator gsi;
> 
>   /* Starting from LAST_STMT, follow the defs of its uses in search
>      of the above pattern.  */
> 
>   if (!vect_reassociating_reduction_simple_p (stmt_vinfo, MAX_EXPR,
>                                             &oprnd0, &oprnd1))
>     return NULL;
> 
>   type = gimple_expr_type (last_stmt);
> 
>   if (!is_a <gphi *> (SSA_NAME_DEF_STMT (oprnd1)))
>     return NULL;
> 
>   stmt_vec_info phy_vinfo = vinfo->lookup_def (oprnd1);
>   if (!phy_vinfo)
>     return NULL;
> 
>   basic_block top_bb = gimple_bb (last_stmt);
>   bool found_p = false;
> 
>   FOR_EACH_IMM_USE_STMT (use_stmt, iter, oprnd1)
>     {
>       if (is_gimple_assign (use_stmt)
>         && gimple_assign_rhs_code (use_stmt) == COND_EXPR)
>       {
>         basic_block bb = gimple_bb (use_stmt);
> 
>         if (bb == top_bb
>             && gimple_uid (use_stmt) < gimple_uid (last_stmt))
>           {
>             tree cond = gimple_assign_rhs1 (use_stmt);
>             if (TREE_CODE (cond) != GT_EXPR)
>               continue;
> 
>             tree true_b = gimple_assign_rhs2 (use_stmt);
>             tree false_b = gimple_assign_rhs3 (use_stmt);
>             TREE_SET_CODE (cond, NE_EXPR);
>             TREE_OPERAND (cond, 1) = gimple_assign_lhs (last_stmt);
>             gimple_set_op (use_stmt, 3, false_b);
>             gimple_set_op (use_stmt, 2, true_b);
> 
>             gimple_seq *def_seq = &STMT_VINFO_PATTERN_DEF_SEQ (stmt_vinfo);
>             gsi = gsi_for_stmt (use_stmt, def_seq);
>             gsi_remove (&gsi, false);
>             gsi = gsi_for_stmt (last_stmt, def_seq);
>             gimple_set_bb (use_stmt, bb);
>             gsi_insert_after (&gsi, use_stmt, GSI_SAME_STMT);
> 
>             vect_pattern_detected ("vect_recog_minmax_index_pattern",
>                                    last_stmt);
>             found_p = true;
>           }
>       }
>     }
> 
>   if (found_p)
>     {
>         STMT_VINFO_DEF_TYPE (phy_vinfo) = vect_min_max_reduction_def;
>     }
> 
>   return NULL;
> }
> 
> So it's rewriting the statement into one where the recursive phi node is 
> only used once And marking the reduction as a vect_min_max_reduction_def 
> to handle it correctly later on.

The main issue you likely run into is that cycle detection runs before
pattern detection and cycle detection already fails.

> There was two things that bugged be about it though. Returning NULL from the 
> vect-pattern felt odd. 
> We could return the last statement here as a the pattern, but since we also 
> have to re-order the statements in the BB it seemed better not to treat it as 
> a simple pattern.
> This made me think it really didn't belong here and that it's not really a 
> vectorizer pattern, but rather a work around.
> 
> Maybe this should be done in ifconv? Where the conversions are created 
> in the first place?

That's also a possibility though the IL is valid also before if-conversion
so if-conversion might end up doing nothing.

> Also doing this seemed to require a lot of generic boiler plate changes 
> to support vect_min_max_reduction_def, which also seemed a bit messy.

Well, that I'd simply defer to vectorizable_reduction which walks
the discovered cycle and computes STMT_VINFO_REDUC_TYPE
(COND_REDUCTION, etc.).

> Any thoughts here? 
> 
> On a side note, this code in vect_create_epilog_for_reduction
> 
>           for (elt_offset = nelements / 2;
>                elt_offset >= 1;
>                elt_offset /= 2)
>             {
>             calc_vec_perm_mask_for_shift (elt_offset, nelements, &sel);
>             indices.new_vector (sel, 2, nelements);
>             tree mask = vect_gen_perm_mask_any (vectype1, indices);
>             new_name = gimple_build (&stmts, VEC_PERM_EXPR, vectype1,
>                                      new_temp, zero_vec, mask);
>             new_temp = gimple_build (&stmts, code,
>                                      vectype1, new_name, new_temp);
>             }
>            gsi_insert_seq_before (&exit_gsi, stmts, GSI_SAME_STMT);
> 
>            .....
> 
>         rhs = build3 (BIT_FIELD_REF, scalar_type, new_temp,
>                       bitsize, bitsize_zero_node);
>                epilog_stmt = gimple_build_assign (new_scalar_dest, rhs);
> 
> Seems like a very roundabout way of doing a reduction.. unless I have 
> misunderstood something?
> If "code" is PLUS, MIN, MAX etc wouldn't a reduction call here be better if 
> the target supports it?  This is quite hard to patch up later in combine 
> particularly for byte of hf modes where the number of statements you'd need 
> to match exceed combine's maximum.

Yes, I think that's already done - vectorizable_reduction checks for
such supported epilogue operation.

So I think the main issue right now is that vect_is_simple_reduction
needs to identify the beast as a single reduction.

I think it's kind-of a COND_REDUCTION where the induction PHI is already
in the original source.

Richard.

> Thanks,
> Tamar
> 
> > -----Original Message-----
> > From: Richard Biener <rguent...@suse.de>
> > Sent: Monday, November 18, 2019 11:24
> > To: Joel Hutton <joel.hut...@arm.com>
> > Cc: GCC Patches <gcc-patches@gcc.gnu.org>; Tamar Christina
> > <tamar.christ...@arm.com>; nd <n...@arm.com>
> > Subject: Re: [RFC][GCC][AArch64] Add minmax phi-reduction pattern
> > 
> > On Fri, 15 Nov 2019, Joel Hutton wrote:
> > 
> > > Forgot to CC maintainer.
> > > On 15/11/2019 18:03, Joel wrote:
> > > > Hi all,
> > > >
> > > > Just looking for some feedback on the approach.
> > > >
> > > > Currently the loop vectorizer can't vectorize the following typical
> > > > loop for getting max value and index from an array:
> > > >
> > > > void test_vec(int *data, int n) {
> > > >         int best_i, best = 0;
> > > >
> > > >         for (int i = 0; i < n; i++) {
> > > >                 if (data[i] > best) {
> > > >                         best = data[i];
> > > >                         best_i = i;
> > > >                 }
> > > >         }
> > > >
> > > >         data[best_i] = data[0];
> > > >         data[0] = best;
> > > > }
> > > >
> > > > This patch adds some support for this pattern.
> > > >
> > > > This patch addresses Bug 88259.
> > > >
> > > > Regression testing is still in progress, This patch does not work
> > > > correctly with vect-epilogues-nomask, and the reason for that is
> > > > still being investigated.
> > 
> > Quick posting before stage1 ends, heh?
> > 
> > New functions lack comments, new fields in stmt_info as well, accesses
> > should go through (missing) STMT_VINFO_* macros.
> > 
> > You are removing a safety check without replacement elsewhere:
> > 
> > -      /* Check there's only a single stmt the op is used on inside
> > -         of the loop.  */
> > -      imm_use_iterator imm_iter;
> > -      gimple *op_use_stmt;
> > -      unsigned cnt = 0;
> > -      FOR_EACH_IMM_USE_STMT (op_use_stmt, imm_iter, op)
> > -       if (!is_gimple_debug (op_use_stmt)
> > -           && flow_bb_inside_loop_p (loop, gimple_bb (op_use_stmt)))
> > -         cnt++;
> > -      if (cnt != 1)
> > -       {
> > -         fail = true;
> > -         break;
> > -       }
> > 
> > AFAICS you still analyze both PHIs but the correct thing to do here is to 
> > view
> > both SSA cycles as a single one - when analyzing
> > 
> >   # best_i_25 = PHI <best_i_11(8), best_i_16(D)(18)>
> >   # best_26 = PHI <best_13(8), 0(18)>
> > ...
> >   best_i_11 = _4 <= best_26 ? best_i_25 : i_27;
> >   best_13 = MAX_EXPR <_4, best_26>;
> > 
> > and starting from the best_i_25 PHI nothing prevents check_reduction_path
> > SCC finding to walk to the best_26 cycle but luck(?).
> > 
> > And the sanity check should be that all uses are within the detected cycle
> > (which then would be the case).
> > 
> > So your handling looks like an afterthought - exactly going backwards of my
> > attempts to make the code better structured :/
> > 
> > The code-gen you do in vect_create_epilog_for_reduction needs a comment
> > - vect_create_epilog_for_reduction will be called twice in unspecified 
> > order,
> > so clearly unconditionally accessing stmt_info->reduc_related_stmt-
> > >reduc_minmax_epilog_stmt
> > (if it is what it looks like...) is not going to work.  You are also using
> > IFN_REDUC_MAX which possibly isn't available.
> > 
> > So I think what you want to do is "detect" the SCC with two PHIs, then -
> > maybe in tree-vect-patterns replace it with a single PHI and a single
> > operation, or somehow mangle it into a SLP tree to make it a single 
> > reduction.
> > 
> > So please see to all this for next stage1.
> > 
> > Thanks,
> > Richard.
> > 
> > > > gcc/ChangeLog:
> > > >
> > > >
> > > > 2019-11-15  Joel Hutton  <joel.hut...@arm.com>
> > > >         Tamar Christina  <tamar.christ...@arm.com>
> > > >
> > > >     PR tree-optimization/88259
> > > >     * tree-vect-loop.c (vect_reassociating_reduction_simple_p): New
> > > > function.
> > > >     (vect_recog_minmax_index_pattern): New function.
> > > >     (vect_is_simple_reduction): Add check for minmax pattern.
> > > >     (vect_model_reduction_cost): Add case for minmax pattern.
> > > >     (vect_create_epilog_for_reduction): Add fixup for minmax epilog.
> > > >     * tree-vectorizer.h (enum vect_reduction_type): Add
> > > > INDEX_MINMAX_REDUCTION reduction type.
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

Reply via email to