arm: Convert Neon 2-reg-scalar integer multiplies to decodetree

Richard Henderson Thu, 11 Jun 2020 09:13:26 -0700

On 6/11/20 7:45 AM, Peter Maydell wrote:
> Convert the VMLA, VMLS and VMUL insns in the Neon "2 registers and a
> scalar" group to decodetree.  These are 32x32->32 operations where
> one of the inputs is the scalar, followed by a possible accumulate
> operation of the 32-bit result.
> 
> The refactoring removes some of the oddities of the old decoder:
>  * operands to the operation and accumulation were often
>    reversed (taking advantage of the fact that most of these ops
>    are commutative); the new code follows the pseudocode order
>  * the Q bit in the insn was in a local variable 'u'; in the
>    new code it is decoded into a->q
> 
> Signed-off-by: Peter Maydell <peter.mayd...@linaro.org>


Reviewed-by: Richard Henderson <richard.hender...@linaro.org>


> +static void gen_neon_dup_low16(TCGv_i32 var)
> +{
> +    TCGv_i32 tmp = tcg_temp_new_i32();
> +    tcg_gen_ext16u_i32(var, var);
> +    tcg_gen_shli_i32(tmp, var, 16);
> +    tcg_gen_or_i32(var, var, tmp);
> +    tcg_temp_free_i32(tmp);
> +}
> +
> +static void gen_neon_dup_high16(TCGv_i32 var)
> +{
> +    TCGv_i32 tmp = tcg_temp_new_i32();
> +    tcg_gen_andi_i32(var, var, 0xffff0000);
> +    tcg_gen_shri_i32(tmp, var, 16);
> +    tcg_gen_or_i32(var, var, tmp);
> +    tcg_temp_free_i32(tmp);
> +}

I was going to quibble about this implementation, but see that it's a straight
move from translate.c.

The real TODO should be a conversion to tcg_gen_gvec_2s(), so that we use real
vector multiplies and adds here, with the scalar duped across the vector, not
just an i32.


r~

Re: [PATCH 03/10] target/arm: Convert Neon 2-reg-scalar integer multiplies to decodetree

Reply via email to