Richard Biener <rguent...@suse.de> writes: > The following attempts to account for the fact that BB vectorization > regions now can span multiple loop levels and that an unprofitable > inner loop vectorization shouldn't be offsetted by a profitable > outer loop vectorization to make it overall profitable. > > For now I've implemented a heuristic based on the premise that > vectorization should be profitable even if loops may not be entered > or if they iterate any number of times. Especially the first > assumption then requires that stmts directly belonging to loop A > need to be costed separately from stmts belonging to another loop > which also simplifies the implementation. > > On x86 the added testcase has in the outer loop > > t.c:38:20: note: Cost model analysis for part in loop 1: > Vector cost: 56 > Scalar cost: 192 > > and the inner loop > > t.c:38:20: note: Cost model analysis for part in loop 2: > Vector cost: 132 > Scalar cost: 48 > > and thus the vectorization is considered not profitable > (note the same would happen in case the 2nd cost were for > a loop outer to the 1st costing). > > Future enhancements may consider static knowledge of whether > a loop is always entered which would allow some inefficiency > in the vectorization of its loop header. Likewise stmts only > reachable from a loop exit can be treated this way. > > Bootstrapped and tested on x86_64-unknown-linux-gnu and it fixes > the regression reported in the PR. > > Does this look sensible and good enough for GCC 11?
I don't know the new SLP code very well, but FWIW it seems reasonable to me. > +/* Comparator for the loop-index sorted cost vectors. */ > + > +static int > +li_cost_vec_cmp (const void *a_, const void *b_) > +{ > + auto *a = (const std::pair<unsigned, stmt_info_for_cost *> *)a_; > + auto *b = (const std::pair<unsigned, stmt_info_for_cost *> *)b_; > + if (a->first < b->first) > + return -1; > + else if (a->first == b->first) > + return 0; > + return 1; > +} This isn't a stable sort. (Or isn't that an issue these days?) > + /* First produce cost vectors sorted by loop index. */ > + auto_vec<std::pair<unsigned, stmt_info_for_cost *> > > + li_scalar_costs (scalar_costs.length ()); > + auto_vec<std::pair<unsigned, stmt_info_for_cost *> > > + li_vector_costs (vector_costs.length ()); > + FOR_EACH_VEC_ELT (scalar_costs, i, cost) > + { > + unsigned l = gimple_bb (cost->stmt_info->stmt)->loop_father->num; > + li_scalar_costs.quick_push (std::make_pair (l, cost)); > + } > + unsigned l = li_scalar_costs[0].first; Is this just to silence an unused warning? Might be worth a comment if so (although couldn't we just use 0?). > + FOR_EACH_VEC_ELT (vector_costs, i, cost) > + { > + /* We inherit from the previous SI, invariants, externals and > + extracts immediately follow the cost for the related stmt. */ Looks like the comment predates s/SI/COST/. Thanks, Richard