On Fri, 5 Feb 2021, Richard Sandiford wrote:

> Richard Biener <rguent...@suse.de> writes:
> > The following attempts to account for the fact that BB vectorization
> > regions now can span multiple loop levels and that an unprofitable
> > inner loop vectorization shouldn't be offsetted by a profitable
> > outer loop vectorization to make it overall profitable.
> >
> > For now I've implemented a heuristic based on the premise that
> > vectorization should be profitable even if loops may not be entered
> > or if they iterate any number of times.  Especially the first
> > assumption then requires that stmts directly belonging to loop A
> > need to be costed separately from stmts belonging to another loop
> > which also simplifies the implementation.
> >
> > On x86 the added testcase has in the outer loop
> >
> > t.c:38:20: note: Cost model analysis for part in loop 1:
> >   Vector cost: 56
> >   Scalar cost: 192
> >
> > and the inner loop
> >
> > t.c:38:20: note: Cost model analysis for part in loop 2:
> >   Vector cost: 132
> >   Scalar cost: 48
> >
> > and thus the vectorization is considered not profitable
> > (note the same would happen in case the 2nd cost were for
> > a loop outer to the 1st costing).
> >
> > Future enhancements may consider static knowledge of whether
> > a loop is always entered which would allow some inefficiency
> > in the vectorization of its loop header.  Likewise stmts only
> > reachable from a loop exit can be treated this way.
> >
> > Bootstrapped and tested on x86_64-unknown-linux-gnu and it fixes
> > the regression reported in the PR.
> >
> > Does this look sensible and good enough for GCC 11?
> 
> I don't know the new SLP code very well, but FWIW it seems reasonable
> to me.
> 
> > +/* Comparator for the loop-index sorted cost vectors.  */
> > +
> > +static int
> > +li_cost_vec_cmp (const void *a_, const void *b_)
> > +{
> > +  auto *a = (const std::pair<unsigned, stmt_info_for_cost *> *)a_;
> > +  auto *b = (const std::pair<unsigned, stmt_info_for_cost *> *)b_;
> > +  if (a->first < b->first)
> > +    return -1;
> > +  else if (a->first == b->first)
> > +    return 0;
> > +  return 1;
> > +}
> 
> This isn't a stable sort.  (Or isn't that an issue these days?)
> 
> > +  /* First produce cost vectors sorted by loop index.  */
> > +  auto_vec<std::pair<unsigned, stmt_info_for_cost *> >
> > +    li_scalar_costs (scalar_costs.length ());
> > +  auto_vec<std::pair<unsigned, stmt_info_for_cost *> >
> > +    li_vector_costs (vector_costs.length ());
> > +  FOR_EACH_VEC_ELT (scalar_costs, i, cost)
> > +    {
> > +      unsigned l = gimple_bb (cost->stmt_info->stmt)->loop_father->num;
> > +      li_scalar_costs.quick_push (std::make_pair (l, cost));
> > +    }
> > +  unsigned l = li_scalar_costs[0].first;
> 
> Is this just to silence an unused warning?  Might be worth a comment if so
> (although couldn't we just use 0?).

The issue is that not all vector_costs entries have a stmt_info so 
this uses a random loop also used in the scalar code (that's going
to be the correct loop in case there's only one loop involved which
is probably 99% of the cases).  I'mm add a comment.

> > +  FOR_EACH_VEC_ELT (vector_costs, i, cost)
> > +    {
> > +      /* We inherit from the previous SI, invariants, externals and
> > +    extracts immediately follow the cost for the related stmt.  */
> 
> Looks like the comment predates s/SI/COST/.

Indeed, will fix and push.

Thanks,
Richard.

Reply via email to