On Mon, Aug 16, 2021 at 8:03 AM Kewen.Lin <li...@linux.ibm.com> wrote:
>
> Hi,
>
> IIUC, the function vectorizable_bb_reduc_epilogue missed to
> consider the cost to extract the final value from the vector
> for reduc operations.  This patch is to add one time of
> vec_to_scalar cost for extracting.
>
> Bootstrapped & regtested on powerpc64le-linux-gnu P9.
> The testing on x86_64 and aarch64 is ongoing.
>
> Is it ok for trunk?

There's no such instruction necessary, the way the costing works
the result is in lane zero already.  Note the optabs are defined
to reduce to a scalar already.  So if your arch implements those and
requires such move then the backend costing needs to handle that.

That said, ideally we'd simply cost the IFN_REDUC_* in the backend
but for BB reductions we don't actually build a SLP node with such
representative stmt to pass down (yet).

I guess you're running into a integer reduction where there's
a vector -> gpr move missing in costing?  I suppose costing
vec_to_scalar works for that but in the end we should maybe
find a way to cost the IFN_REDUC_* ...

Richard.

> BR,
> Kewen
> -----
> gcc/ChangeLog:
>
>         * tree-vect-slp.c (vectorizable_bb_reduc_epilogue): Add the cost for
>         value extraction.
>
> diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
> index b9d88c2d943..841a0872afa 100644
> --- a/gcc/tree-vect-slp.c
> +++ b/gcc/tree-vect-slp.c
> @@ -4845,12 +4845,14 @@ vectorizable_bb_reduc_epilogue (slp_instance instance,
>      return false;
>
>    /* There's no way to cost a horizontal vector reduction via REDUC_FN so
> -     cost log2 vector operations plus shuffles.  */
> +     cost log2 vector operations plus shuffles and one extraction.  */
>    unsigned steps = floor_log2 (vect_nunits_for_cost (vectype));
>    record_stmt_cost (cost_vec, steps, vector_stmt, instance->root_stmts[0],
>                     vectype, 0, vect_body);
>    record_stmt_cost (cost_vec, steps, vec_perm, instance->root_stmts[0],
>                     vectype, 0, vect_body);
> +  record_stmt_cost (cost_vec, 1, vec_to_scalar, instance->root_stmts[0],
> +                   vectype, 0, vect_body);
>    return true;
>  }

Reply via email to