On 11/17/2017 08:38 AM, Richard Sandiford wrote:
> This patch adds support for fully-masking loops that require peeling
> for gaps. It peels exactly one scalar iteration and uses the masked
> loop to handle the rest. Previously we would fall back on using a
> standard unmasked loop instead.
>
> Tested on aarch64-linux-gnu (with and without SVE), x86_64-linux-gnu
> and powerpc64le-linux-gnu. OK to install?
>
> Richard
>
>
> 2017-11-17 Richard Sandiford <richard.sandif...@linaro.org>
> Alan Hayward <alan.hayw...@arm.com>
> David Sherwood <david.sherw...@arm.com>
>
> gcc/
> * tree-vect-loop-manip.c (vect_gen_scalar_loop_niters): Replace
> vfm1 with a bound_epilog parameter.
> (vect_do_peeling): Update calls accordingly, and move the prologue
> call earlier in the function. Treat the base bound_epilog as 0 for
> fully-masked loops and retain vf - 1 for other loops. Add 1 to
> this base when peeling for gaps.
> * tree-vect-loop.c (vect_analyze_loop_2): Allow peeling for gaps
> with fully-masked loops.
> (vect_estimate_min_profitable_iters): Handle the single peeled
> iteration in that case.
>
> gcc/testsuite/
> * gcc.target/aarch64/sve_struct_vect_18.c: Check the number
> of branches.
> * gcc.target/aarch64/sve_struct_vect_19.c: Likewise.
> * gcc.target/aarch64/sve_struct_vect_20.c: New test.
> * gcc.target/aarch64/sve_struct_vect_20_run.c: Likewise.
> * gcc.target/aarch64/sve_struct_vect_21.c: Likewise.
> * gcc.target/aarch64/sve_struct_vect_21_run.c: Likewise.
> * gcc.target/aarch64/sve_struct_vect_22.c: Likewise.
> * gcc.target/aarch64/sve_struct_vect_22_run.c: Likewise.
> * gcc.target/aarch64/sve_struct_vect_23.c: Likewise.
> * gcc.target/aarch64/sve_struct_vect_23_run.c: Likewise.
OK.
jeff
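
For readers following the thread, the bound_epilog rule from the quoted
ChangeLog can be sketched in plain C. This is only an illustration of the
arithmetic as the ChangeLog words it, not the code in
tree-vect-loop-manip.c: the function name, the parameter names and the use
of plain unsigned values are assumptions made for the example, and the
real code may test further conditions that the ChangeLog does not mention.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: upper bound on the number of scalar iterations
   left for the epilogue loop, following the ChangeLog wording.
   vf is the vectorization factor (assumed to be a compile-time
   constant here, which is not true of SVE in general).  */
static unsigned
bound_epilog (unsigned vf, bool fully_masked, bool peel_for_gaps)
{
  assert (vf >= 1);

  /* A fully-masked loop handles the tail under a mask, so its base
     bound is 0; other loops can leave up to vf - 1 iterations.  */
  unsigned bound = fully_masked ? 0 : vf - 1;

  /* Peeling for gaps adds one scalar iteration to that base.  */
  if (peel_for_gaps)
    bound += 1;

  return bound;
}
```

With vf = 4, this gives a bound of 1 for a fully-masked loop that needs
peeling for gaps (the single peeled scalar iteration the patch describes)
and 4 for an unmasked loop that needs peeling for gaps.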