On Thu, Dec 14, 2017 at 12:43:11AM +0000, Jeff Law wrote:
> On 11/22/2017 11:10 AM, Richard Sandiford wrote:
> > Richard Sandiford <richard.sandif...@linaro.org> writes:
> >> Two things stopped us using SLP reductions with variable-length vectors:
> >>
> >> (1) We didn't have a way of constructing the initial vector.
> >>     This patch does it by creating a vector full of the neutral
> >>     identity value and then using a shift-and-insert function
> >>     to insert any non-identity inputs into the low-numbered elements.
> >>     (The non-identity values are needed for double reductions.)
> >>     Alternatively, for unchained MIN/MAX reductions that have no neutral
> >>     value, we instead use the same duplicate-and-interleave approach as
> >>     for SLP constant and external definitions (added by a previous
> >>     patch).
> >>
> >> (2) The epilogue for constant-length vectors would extract the vector
> >>     elements associated with each SLP statement and do scalar arithmetic
> >>     on these individual elements.  For variable-length vectors, the patch
> >>     instead creates a reduction vector for each SLP statement, replacing
> >>     the elements for other SLP statements with the identity value.
> >>     It then uses a hardware reduction instruction on each vector.
> >>
> >> Tested on aarch64-linux-gnu (with and without SVE), x86_64-linux-gnu
> >> and powerpc64le-linux-gnu.
> > 
> > Here's an updated version that applies on top of the recent
> > removal of REDUC_*_EXPR.  Tested as before.
> > 
> > Thanks,
> > Richard
> > 
> > 
> > 2017-11-22  Richard Sandiford  <richard.sandif...@linaro.org>
> >         Alan Hayward  <alan.hayw...@arm.com>
> >         David Sherwood  <david.sherw...@arm.com>
> > 
> > gcc/
> >     * doc/md.texi (vec_shl_insert_@var{m}): New optab.
> >     * internal-fn.def (VEC_SHL_INSERT): New internal function.
> >     * optabs.def (vec_shl_insert_optab): New optab.
> >     * tree-vectorizer.h (can_duplicate_and_interleave_p): Declare.
> >     (duplicate_and_interleave): Likewise.
> >     * tree-vect-loop.c: Include internal-fn.h.
> >     (neutral_op_for_slp_reduction): New function, split out from
> >     get_initial_defs_for_reduction.
> >     (get_initial_def_for_reduction): Handle option 2 for variable-length
> >     vectors by loading the neutral value into a vector and then shifting
> >     the initial value into element 0.
> >     (get_initial_defs_for_reduction): Replace the code argument with
> >     the neutral value calculated by neutral_op_for_slp_reduction.
> >     Use gimple_build_vector for constant-length vectors.
> >     Use IFN_VEC_SHL_INSERT for variable-length vectors if all
> >     but the first group_size elements have a neutral value.
> >     Use duplicate_and_interleave otherwise.
> >     (vect_create_epilog_for_reduction): Take a neutral_op parameter.
> >     Update call to get_initial_defs_for_reduction.  Handle SLP
> >     reductions for variable-length vectors by creating one vector
> >     result for each scalar result, with the elements associated
> >     with other scalar results stubbed out with the neutral value.
> >     (vectorizable_reduction): Call neutral_op_for_slp_reduction.
> >     Require IFN_VEC_SHL_INSERT for double reductions on
> >     variable-length vectors, or SLP reductions that have
> >     a neutral value.  Require can_duplicate_and_interleave_p
> >     support for variable-length unchained SLP reductions if there
> >     is no neutral value, such as for MIN/MAX reductions.  Also require
> >     the number of vector elements to be a multiple of the number of
> >     SLP statements when doing variable-length unchained SLP reductions.
> >     Update call to vect_create_epilog_for_reduction.
> >     * tree-vect-slp.c (can_duplicate_and_interleave_p): Make public
> >     and remove initial values.
> >     (duplicate_and_interleave): Use IFN_VEC_SHL_INSERT for
> >     variable-length vectors if all but the first group_size elements
> >     have a neutral value.
> >     * config/aarch64/aarch64.md (UNSPEC_INSR): New unspec.
> >     * config/aarch64/aarch64-sve.md (vec_shl_insert_<mode>): New insn.
> > 
> > gcc/testsuite/
> >     * gcc.dg/vect/pr37027.c: Remove XFAIL for variable-length vectors.
> >     * gcc.dg/vect/pr67790.c: Likewise.
> >     * gcc.dg/vect/slp-reduc-1.c: Likewise.
> >     * gcc.dg/vect/slp-reduc-2.c: Likewise.
> >     * gcc.dg/vect/slp-reduc-3.c: Likewise.
> >     * gcc.dg/vect/slp-reduc-5.c: Likewise.
> >     * gcc.target/aarch64/sve_slp_5.c: New test.
> >     * gcc.target/aarch64/sve_slp_5_run.c: Likewise.
> >     * gcc.target/aarch64/sve_slp_6.c: Likewise.
> >     * gcc.target/aarch64/sve_slp_6_run.c: Likewise.
> >     * gcc.target/aarch64/sve_slp_7.c: Likewise.
> >     * gcc.target/aarch64/sve_slp_7_run.c: Likewise.
> OK
> jeff

As you explicitly asked on another thread, this is OK from an AArch64
maintainer too.

James
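
For readers following the thread, the two steps in the quoted summary can be modelled in plain C. This is only a scalar sketch under stated assumptions, not the code from the patch: the helper names (shl_insert, build_initial_vector, reduce_for_stmt) are hypothetical, a fixed-size array stands in for a variable-length vector, the reduction is assumed to be a sum, and lane i is assumed to belong to SLP statement i % group_size.

```c
#include <assert.h>

/* Scalar model of a shift-and-insert operation: every element moves one
   lane away from lane 0 and X is written into lane 0.  */
static void
shl_insert (int *vec, int nelts, int x)
{
  for (int i = nelts - 1; i > 0; i--)
    vec[i] = vec[i - 1];
  vec[0] = x;
}

/* Step (1): fill the vector with the neutral value, then insert the
   GROUP_SIZE initial values in reverse order so that init[0] ends up
   in lane 0, init[1] in lane 1, and so on.  */
static void
build_initial_vector (int *vec, int nelts, int neutral,
                      const int *init, int group_size)
{
  assert (group_size <= nelts);
  for (int i = 0; i < nelts; i++)
    vec[i] = neutral;
  for (int i = group_size - 1; i >= 0; i--)
    shl_insert (vec, nelts, init[i]);
}

/* Step (2): epilogue for SLP statement STMT of a sum reduction.  Lanes
   that belong to other statements are replaced with the neutral value,
   then the whole vector is reduced.  The loop stands in for a single
   hardware reduction instruction.  */
static int
reduce_for_stmt (const int *vec, int nelts, int group_size,
                 int stmt, int neutral)
{
  /* Each statement must own the same number of lanes.  */
  assert (nelts % group_size == 0);
  int sum = neutral;
  for (int i = 0; i < nelts; i++)
    sum += (i % group_size == stmt) ? vec[i] : neutral;
  return sum;
}
```

For example, with 8 lanes, a group size of 2, a neutral value of 0 and initial values {100, 200}, build_initial_vector gives {100, 200, 0, 0, 0, 0, 0, 0}.  After adding {1, 2, 3, 4, 5, 6, 7, 8} lane by lane, reduce_for_stmt returns 116 for statement 0 and 220 for statement 1.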
