On 6/11/20 7:45 AM, Peter Maydell wrote:
> Convert the VMLA, VMLS and VMUL insns in the Neon "2 registers and a
> scalar" group to decodetree. These are 32x32->32 operations where
> one of the inputs is the scalar, followed by a possible accumulate
> operation of the 32-bit result.
>
> The refactoring removes some of the oddities of the old decoder:
>  * operands to the operation and accumulation were often
>    reversed (taking advantage of the fact that most of these ops
>    are commutative); the new code follows the pseudocode order
>  * the Q bit in the insn was in a local variable 'u'; in the
>    new code it is decoded into a->q
>
> Signed-off-by: Peter Maydell <peter.mayd...@linaro.org>

Reviewed-by: Richard Henderson <richard.hender...@linaro.org>

> +static void gen_neon_dup_low16(TCGv_i32 var)
> +{
> +    TCGv_i32 tmp = tcg_temp_new_i32();
> +    tcg_gen_ext16u_i32(var, var);
> +    tcg_gen_shli_i32(tmp, var, 16);
> +    tcg_gen_or_i32(var, var, tmp);
> +    tcg_temp_free_i32(tmp);
> +}
> +
> +static void gen_neon_dup_high16(TCGv_i32 var)
> +{
> +    TCGv_i32 tmp = tcg_temp_new_i32();
> +    tcg_gen_andi_i32(var, var, 0xffff0000);
> +    tcg_gen_shri_i32(tmp, var, 16);
> +    tcg_gen_or_i32(var, var, tmp);
> +    tcg_temp_free_i32(tmp);
> +}

I was going to quibble about this implementation, but see that it's a
straight move from translate.c.

The real TODO should be a conversion to tcg_gen_gvec_2s(), so that we use
real vector multiplies and adds here, with the scalar duped across the
vector, not just an i32.

r~
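
For anyone reading along without the TCG ops in their head, here is a rough
plain-C model of what the two quoted helpers compute on a 32-bit value. This
is only a sketch: dup_low16(), dup_high16(), mla16x2() and dup16_selfcheck()
are names made up for illustration and are not QEMU functions. mla16x2() is
an assumed model of a two-lane 16-bit multiply-accumulate against the duped
scalar, included to show why the scalar gets copied into both halves; it is
not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Model of gen_neon_dup_low16: copy bits [15:0] into both halves. */
static uint32_t dup_low16(uint32_t v)
{
    v &= 0xffffu;              /* tcg_gen_ext16u_i32(var, var)    */
    return v | (v << 16);      /* shli into tmp, then or into var */
}

/* Model of gen_neon_dup_high16: copy bits [31:16] into both halves. */
static uint32_t dup_high16(uint32_t v)
{
    v &= 0xffff0000u;          /* tcg_gen_andi_i32(var, var, 0xffff0000) */
    return v | (v >> 16);      /* shri into tmp, then or into var        */
}

/*
 * Hypothetical two-lane model (not from the patch): each 16-bit lane of
 * d accumulates the product of the matching lanes of n and s, truncated
 * to 16 bits. With s produced by one of the dup helpers, both lanes are
 * multiplied by the same scalar.
 */
static uint32_t mla16x2(uint32_t d, uint32_t n, uint32_t s)
{
    uint32_t lo = (d + (n & 0xffffu) * (s & 0xffffu)) & 0xffffu;
    uint32_t hi = ((d >> 16) + (n >> 16) * (s >> 16)) & 0xffffu;
    return lo | (hi << 16);
}

/* Small self-check of the models above. */
static void dup16_selfcheck(void)
{
    assert(dup_low16(0x1234abcdu) == 0xabcdabcdu);
    assert(dup_high16(0x1234abcdu) == 0x12341234u);
    /* lanes: 2 + 4*5 = 0x16 (low), 1 + 3*5 = 0x10 (high) */
    assert(mla16x2(0x00010002u, 0x00030004u,
                   dup_low16(0x99990005u)) == 0x00100016u);
}
```

The i32 approach handles two 16-bit lanes per operation at most; the
tcg_gen_gvec_2s() conversion suggested above would instead let the backend
apply the duped scalar across the whole vector.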