Richard Biener <[email protected]> writes:
> The following attempts to account for the fact that BB vectorization
> regions now can span multiple loop levels and that an unprofitable
> inner loop vectorization shouldn't be offsetted by a profitable
> outer loop vectorization to make it overall profitable.
>
> For now I've implemented a heuristic based on the premise that
> vectorization should be profitable even if loops may not be entered
> or if they iterate any number of times. Especially the first
> assumption then requires that stmts directly belonging to loop A
> need to be costed separately from stmts belonging to another loop
> which also simplifies the implementation.
>
> On x86 the added testcase has in the outer loop
>
> t.c:38:20: note: Cost model analysis for part in loop 1:
> Vector cost: 56
> Scalar cost: 192
>
> and the inner loop
>
> t.c:38:20: note: Cost model analysis for part in loop 2:
> Vector cost: 132
> Scalar cost: 48
>
> and thus the vectorization is considered not profitable
> (note the same would happen in case the 2nd cost were for
> a loop outer to the 1st costing).
>
> Future enhancements may consider static knowledge of whether
> a loop is always entered which would allow some inefficiency
> in the vectorization of its loop header. Likewise stmts only
> reachable from a loop exit can be treated this way.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu and it fixes
> the regression reported in the PR.
>
> Does this look sensible and good enough for GCC 11?
I don't know the new SLP code very well, but FWIW it seems reasonable
to me.
> +/* Comparator for the loop-index sorted cost vectors. */
> +
> +static int
> +li_cost_vec_cmp (const void *a_, const void *b_)
> +{
> + auto *a = (const std::pair<unsigned, stmt_info_for_cost *> *)a_;
> + auto *b = (const std::pair<unsigned, stmt_info_for_cost *> *)b_;
> + if (a->first < b->first)
> + return -1;
> + else if (a->first == b->first)
> + return 0;
> + return 1;
> +}
This isn't a stable sort. (Or isn't that an issue these days?)
> + /* First produce cost vectors sorted by loop index. */
> + auto_vec<std::pair<unsigned, stmt_info_for_cost *> >
> + li_scalar_costs (scalar_costs.length ());
> + auto_vec<std::pair<unsigned, stmt_info_for_cost *> >
> + li_vector_costs (vector_costs.length ());
> + FOR_EACH_VEC_ELT (scalar_costs, i, cost)
> + {
> + unsigned l = gimple_bb (cost->stmt_info->stmt)->loop_father->num;
> + li_scalar_costs.quick_push (std::make_pair (l, cost));
> + }
> + unsigned l = li_scalar_costs[0].first;
Is this just to silence an unused warning? Might be worth a comment if so
(although couldn't we just use 0?).
> + FOR_EACH_VEC_ELT (vector_costs, i, cost)
> + {
> + /* We inherit from the previous SI, invariants, externals and
> + extracts immediately follow the cost for the related stmt. */
Looks like the comment predates s/SI/COST/.
Thanks,
Richard