On 11/17/2017 03:10 PM, Richard Sandiford wrote:
> This is mostly a mechanical extension of the previous gather load
> support to scatter stores. The internal functions in this case are:
>
> IFN_SCATTER_STORE (base, offsets, scale, values)
> IFN_MASK_SCATTER_STORE (base, offsets, scale, values, mask)
>
> However, one nonobvious change is to vect_analyze_data_ref_access.
> If we're treating an access as a gather load or scatter store
> (i.e. if STMT_VINFO_GATHER_SCATTER_P is true), the existing code
> would create a dummy data_reference whose step is 0. There's not
> really much else it could do, since the whole point is that the
> step isn't predictable from iteration to iteration. We then
> went into this code in vect_analyze_data_ref_access:
>
>       /* Allow loads with zero step in inner-loop vectorization.  */
>       if (loop_vinfo && integer_zerop (step))
> 	{
> 	  GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt)) = NULL;
> 	  if (!nested_in_vect_loop_p (loop, stmt))
> 	    return DR_IS_READ (dr);
>
> I.e. we'd take the step literally and assume that this is a load
> or store to an invariant address. Loads from invariant addresses
> are supported but stores to them aren't.
>
> The code therefore had the effect of disabling all scatter stores.
> AFAICT this is true of AVX too: although tests like avx512f-scatter-1.c
> test for the correctness of a scatter-like loop, they don't seem to
> check whether a scatter instruction is actually used.
>
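A codegen check of that kind would look something like the following DejaGnu sketch (hand-written, not from the testsuite; the exact options and the mnemonic to scan for depend on the target and element type, so the `vscatter` pattern is an assumption):

```c
/* { dg-do compile } */
/* { dg-options "-O3 -mavx512f" } */

void
f (float *base, int *idx, float *src, int n)
{
  for (int i = 0; i < n; i++)
    base[idx[i]] = src[i];
}

/* Check that a scatter instruction is actually emitted, rather than
   only checking that an executed loop produces the right values.  */
/* { dg-final { scan-assembler "vscatter" } } */
```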
> The patch therefore makes vect_analyze_data_ref_access return true
> for scatters. We do seem to handle the aliasing correctly;
> that's tested by other functions, and is symmetrical to the
> already-working gather case.
>
> Tested on aarch64-linux-gnu (with and without SVE), x86_64-linux-gnu
> and powerpc64le-linux-gnu. OK to install?
>
> Richard
>
>
> 2017-11-17  Richard Sandiford  <richard.sandif...@linaro.org>
> 	    Alan Hayward  <alan.hayw...@arm.com>
> 	    David Sherwood  <david.sherw...@arm.com>
>
> gcc/
> 	* optabs.def (scatter_store_optab, mask_scatter_store_optab): New
> 	optabs.
> 	* doc/md.texi (scatter_store@var{m}, mask_scatter_store@var{m}):
> 	Document.
> 	* genopinit.c (main): Add supports_vec_scatter_store and
> 	supports_vec_scatter_store_cached to target_optabs.
> 	* gimple.h (gimple_expr_type): Handle IFN_SCATTER_STORE and
> 	IFN_MASK_SCATTER_STORE.
> 	* internal-fn.def (SCATTER_STORE, MASK_SCATTER_STORE): New internal
> 	functions.
> 	* internal-fn.h (internal_store_fn_p): Declare.
> 	(internal_fn_stored_value_index): Likewise.
> 	* internal-fn.c (scatter_store_direct): New macro.
> 	(expand_scatter_store_optab_fn): New function.
> 	(direct_scatter_store_optab_supported_p): New macro.
> 	(internal_store_fn_p): New function.
> 	(internal_gather_scatter_fn_p): Handle IFN_SCATTER_STORE and
> 	IFN_MASK_SCATTER_STORE.
> 	(internal_fn_mask_index): Likewise.
> 	(internal_fn_stored_value_index): New function.
> 	(internal_gather_scatter_fn_supported_p): Adjust operand numbers
> 	for scatter stores.
> 	* optabs-query.h (supports_vec_scatter_store_p): Declare.
> 	* optabs-query.c (supports_vec_scatter_store_p): New function.
> 	* tree-vectorizer.h (vect_get_store_rhs): Declare.
> 	* tree-vect-data-refs.c (vect_analyze_data_ref_access): Return
> 	true for scatter stores.
> 	(vect_gather_scatter_fn_p): Handle scatter stores too.
> 	(vect_check_gather_scatter): Consider using scatter stores if
> 	supports_vec_scatter_store_p.
> 	* tree-vect-patterns.c (vect_try_gather_scatter_pattern): Handle
> 	scatter stores too.
> 	* tree-vect-stmts.c (exist_non_indexing_operands_for_use_p): Use
> 	internal_fn_stored_value_index.
> 	(check_load_store_masking): Handle scatter stores too.
> 	(vect_get_store_rhs): Make public.
> 	(vectorizable_call): Use internal_store_fn_p.
> 	(vectorizable_store): Handle scatter store internal functions.
> 	(vect_transform_stmt): Compare GROUP_STORE_COUNT with GROUP_SIZE
> 	when deciding whether the end of the group has been reached.
> 	* config/aarch64/aarch64.md (UNSPEC_ST1_SCATTER): New unspec.
> 	* config/aarch64/aarch64-sve.md (scatter_store<mode>): New expander.
> 	(mask_scatter_store<mode>): New insns.
>
> gcc/testsuite/
> 	* gcc.target/aarch64/sve_mask_scatter_store_1.c: New test.
> 	* gcc.target/aarch64/sve_mask_scatter_store_2.c: Likewise.
> 	* gcc.target/aarch64/sve_scatter_store_1.c: Likewise.
> 	* gcc.target/aarch64/sve_scatter_store_2.c: Likewise.
> 	* gcc.target/aarch64/sve_scatter_store_3.c: Likewise.
> 	* gcc.target/aarch64/sve_scatter_store_4.c: Likewise.
> 	* gcc.target/aarch64/sve_scatter_store_5.c: Likewise.
> 	* gcc.target/aarch64/sve_scatter_store_6.c: Likewise.
> 	* gcc.target/aarch64/sve_scatter_store_7.c: Likewise.
> 	* gcc.target/aarch64/sve_strided_store_1.c: Likewise.
> 	* gcc.target/aarch64/sve_strided_store_2.c: Likewise.
> 	* gcc.target/aarch64/sve_strided_store_3.c: Likewise.
> 	* gcc.target/aarch64/sve_strided_store_4.c: Likewise.
> 	* gcc.target/aarch64/sve_strided_store_5.c: Likewise.
> 	* gcc.target/aarch64/sve_strided_store_6.c: Likewise.
> 	* gcc.target/aarch64/sve_strided_store_7.c: Likewise.
OK.
jeff