Joel Hutton <[email protected]> writes:
> Hi all,
>
> This patch fixes PR99102. For masked gather loads/scatter stores,
> 'loop_masks' was checked for being non-null, but 'final_mask' is the
> mask actually used.
>
> Bootstrapped and regression tested on aarch64. Regression tested on
> aarch64_sve under qemu.
>
> [Vect] Fix mask check on gather loads/scatter stores
>
> Previously, IFN_MASK_SCATTER_STORE was used if 'loop_masks' was
> non-null, but the mask actually applied is 'final_mask'. This caused
> a 'MASK_STORE' to be vectorized into a 'SCATTER_STORE' instead of a
> 'MASK_SCATTER_STORE'. This fixes PR target/99102.
>
> gcc/ChangeLog:
>
> PR target/99102
> * tree-vect-stmts.c (vectorizable_store): Fix scatter store mask
> check condition.
> (vectorizable_load): Fix gather load mask check condition.
>
> gcc/testsuite/ChangeLog:
>
> PR target/99102
> * gcc.target/aarch64/sve/pr99102.c: New test.
(The filename is out of date, but the git hook would catch that.)
>
> diff --git a/gcc/testsuite/gcc.dg/vect/pr99102.c b/gcc/testsuite/gcc.dg/vect/pr99102.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..9d321b97a68c05ad08646e8e2d6979c2030c65e6
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/pr99102.c
> @@ -0,0 +1,20 @@
> +/* { dg-options "-O2 -ftree-vectorize" } */
> +/* { dg-additional-options "-msve-vector-bits=256" { target aarch64_sve } } */
I should have realised this earlier, sorry, but I think we want
this to be conditional on aarch64_sve256_hw instead.  Same for the
scan-tree-dump below.

When running the testsuite on SVE hardware, the above would force
a VL of 256 even if the actual VL is different, so the test would
fail at runtime for 128-bit SVE, 512-bit SVE, etc.
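Concretely, gating both directives on the runtime effective target would
look something like this (a sketch only; the committed directives may be
arranged differently):

```c
/* Only force VL 256 where the hardware's VL actually is 256 bits,
   so the test does not fail at runtime on 128-bit or 512-bit SVE.  */
/* { dg-additional-options "-msve-vector-bits=256" { target aarch64_sve256_hw } } */
/* ... test body ... */
/* { dg-final { scan-tree-dump "MASK_SCATTER_STORE" "vect" { target aarch64_sve256_hw } } } */
```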
Thanks,
Richard.
> +long a[44];
> +short d, e = -7;
> +__attribute__((noipa)) void b(char f, short j, short k, unsigned l) {
> + for (int g = 0; g < 9; g += f)
> + for (int b = 0; b < 90; b -= k)
> + for (int h = 0; h < f; h++)
> + for (short i = 0; i < 15; i += 4)
> + if (!l)
> + a[i] = j;
> +}
> +int main() {
> + for (long c = 0; c < 2; ++c)
> + a[c] = 7;
> + b(9, d, e, 5);
> + if (!a[0])
> + __builtin_abort();
> +}
> +/* { dg-final { scan-tree-dump "MASK_SCATTER_STORE" "vect" { target aarch64_sve } } } */
> diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
> index 85d3161fe3b2fbe289396457361c7af7519bd2b3..d791d3a47205a6f105e25df5fa22e403d30638eb 100644
> --- a/gcc/tree-vect-stmts.c
> +++ b/gcc/tree-vect-stmts.c
> @@ -8126,7 +8126,7 @@ vectorizable_store (vec_info *vinfo,
> {
> tree scale = size_int (gs_info.scale);
> gcall *call;
> - if (loop_masks)
> + if (final_mask)
> call = gimple_build_call_internal
> (IFN_MASK_SCATTER_STORE, 5, dataref_ptr, vec_offset,
> scale, vec_oprnd, final_mask);
> @@ -9419,7 +9419,7 @@ vectorizable_load (vec_info *vinfo,
> tree zero = build_zero_cst (vectype);
> tree scale = size_int (gs_info.scale);
> gcall *call;
> - if (loop_masks)
> + if (final_mask)
> call = gimple_build_call_internal
> (IFN_MASK_GATHER_LOAD, 5, dataref_ptr,
> vec_offset, scale, zero, final_mask);