On 11/17/2017 08:38 AM, Richard Sandiford wrote:
> This patch adds support for fully-masking loops that require peeling
> for gaps.  It peels exactly one scalar iteration and uses the masked
> loop to handle the rest.  Previously we would fall back on using a
> standard unmasked loop instead.
> 
> Tested on aarch64-linux-gnu (with and without SVE), x86_64-linux-gnu
> and powerpc64le-linux-gnu.  OK to install?
> 
> Richard
> 
> 
> 2017-11-17  Richard Sandiford  <richard.sandif...@linaro.org>
>           Alan Hayward  <alan.hayw...@arm.com>
>           David Sherwood  <david.sherw...@arm.com>
> 
> gcc/
>       * tree-vect-loop-manip.c (vect_gen_scalar_loop_niters): Replace
>       vfm1 with a bound_epilog parameter.
>       (vect_do_peeling): Update calls accordingly, and move the prologue
>       call earlier in the function.  Treat the base bound_epilog as 0 for
>       fully-masked loops and retain vf - 1 for other loops.  Add 1 to
>       this base when peeling for gaps.
>       * tree-vect-loop.c (vect_analyze_loop_2): Allow peeling for gaps
>       with fully-masked loops.
>       (vect_estimate_min_profitable_iters): Handle the single peeled
>       iteration in that case.
> 
> gcc/testsuite/
>       * gcc.target/aarch64/sve_struct_vect_18.c: Check the number
>       of branches.
>       * gcc.target/aarch64/sve_struct_vect_19.c: Likewise.
>       * gcc.target/aarch64/sve_struct_vect_20.c: New test.
>       * gcc.target/aarch64/sve_struct_vect_20_run.c: Likewise.
>       * gcc.target/aarch64/sve_struct_vect_21.c: Likewise.
>       * gcc.target/aarch64/sve_struct_vect_21_run.c: Likewise.
>       * gcc.target/aarch64/sve_struct_vect_22.c: Likewise.
>       * gcc.target/aarch64/sve_struct_vect_22_run.c: Likewise.
>       * gcc.target/aarch64/sve_struct_vect_23.c: Likewise.
>       * gcc.target/aarch64/sve_struct_vect_23_run.c: Likewise.
OK.
jeff
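
For anyone reading the thread without the patch to hand, the bound_epilog rule in the ChangeLog entry for vect_do_peeling can be sketched roughly as below. This is only an illustration of the arithmetic described above, not the code from tree-vect-loop-manip.c: the function name epilogue_bound and its parameters are hypothetical, and vf is assumed to be a compile-time-known vectorization factor of at least 1.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper (not a GCC function): upper bound on the number of
   scalar iterations left for the epilogue loop, following the rule in the
   ChangeLog entry.  */
static unsigned
epilogue_bound (unsigned vf, bool fully_masked, bool peeling_for_gaps)
{
  assert (vf >= 1);

  /* A fully-masked loop handles the tail itself, so the base is 0;
     an unmasked loop can leave up to vf - 1 scalar iterations.  */
  unsigned bound = fully_masked ? 0 : vf - 1;

  /* Peeling for gaps adds one scalar iteration to the base.  */
  if (peeling_for_gaps)
    bound += 1;

  return bound;
}
```

Under these assumptions, a fully-masked loop that needs peeling for gaps gets a bound of exactly 1, which matches the "peels exactly one scalar iteration" wording in the description.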
