On Tue, Sep 15, 2015 at 10:14:42AM +0100, Alan Lawrence wrote:
> The previous patches leave ld[234]_lane, st[234]_lane, and ld[234]r expanders
> all nearly identical, so we can easily parameterize across the number of
> lanes and combine them.
>
> For the ld<VSTRUCT:nregs>_lane pattern, I switched from the VCONQ attribute
> to just using the MODE attribute; this is identical for all the Q-register
> modes over which we iterate.
>
> bootstrapped and check-gcc on aarch64-none-linux-gnu

OK with a comment added to iterators.md explaining the gotcha you
introduce.

>
> gcc/ChangeLog:
>
> * config/aarch64/aarch64-simd.md (aarch64_ld2r<mode>,
> aarch64_ld3r<mode>, aarch64_ld4r<mode>): Combine together, making...
> (aarch64_simd_ld<VSTRUCT:nregs>r<VALLDIF:mode>): ...this.
>
> (aarch64_ld2_lane<mode>, aarch64_ld3_lane<mode>,
> aarch64_ld4_lane<mode>): Combine together, making...
> (aarch64_ld<VSTRUCT:nregs>_lane<VALLDIF:mode>): ...this.
>
> (aarch64_st2_lane<VALLDIF:mode>, aarch64_st3_lane<VQ:mode>,
> aarch64_st4_lane<VQ:mode>): Combine together, making...
> (aarch64_st<VSTRUCT:nregs>_lane<VALLDIF:mode>): ...this.
> ---
> gcc/config/aarch64/aarch64-simd.md | 138 +++++++------------------------------
> 1 file changed, 23 insertions(+), 115 deletions(-)
>
> diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
> index f239ee7..dbe5259 100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -4381,42 +4381,18 @@
> FAIL;
> })
>
> -(define_expand "aarch64_ld2r<mode>"
> - [(match_operand:OI 0 "register_operand" "=w")
> +(define_expand "aarch64_ld<VSTRUCT:nregs>r<VALLDIF:mode>"
> + [(match_operand:VSTRUCT 0 "register_operand" "=w")
> (match_operand:DI 1 "register_operand" "w")
> (unspec:VALLDIF [(const_int 0)] UNSPEC_VSTRUCTDUMMY)]
> "TARGET_SIMD"
> {
> rtx mem = gen_rtx_MEM (BLKmode, operands[1]);
> - set_mem_size (mem, GET_MODE_SIZE (GET_MODE_INNER (<MODE>mode)) * 2);
> + set_mem_size (mem, GET_MODE_SIZE (GET_MODE_INNER (<VALLDIF:MODE>mode))
> + * <VSTRUCT:nregs>);

It is convenient that this falls out, but this second use of nregs is likely
to surprise readers. Please add a comment to nregs explaining that it
represents both the number of Q registers used for the type and the number
of elements touched by the structure load/store operations.
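
As a rough illustration of the second meaning (a sketch only, not code from
the patch; struct_lane_mem_size is a hypothetical helper, not a GCC
function): an ld<n>r or ld<n>/st<n> single-lane operation transfers one
element per register in the structure, so the register count is also the
element count used to size the memory access.

```c
#include <stddef.h>

/* Hypothetical helper, for illustration only; not part of GCC.
   Bytes accessed by an ld<n>r or single-lane ld<n>/st<n> operation.
   One element is transferred per register in the structure, so the
   register count (nregs) doubles as the number of elements touched.  */
size_t
struct_lane_mem_size (size_t elem_bytes, unsigned int nregs)
{
  return elem_bytes * nregs;
}
```

For example, an ld3r on 32-bit elements would be sized as
struct_lane_mem_size (4, 3), i.e. 12 bytes.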

Thanks,
James