On 2/26/23 23:14, gaosong wrote:
> like this:
> the vece is MO_32.
>
> static void gen_vaddwev_s(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b)
> {
>     TCGv_vec t1 = tcg_temp_new_vec_matching(a);
>     TCGv_vec t2 = tcg_temp_new_vec_matching(b);
>     int halfbits = 4 << vece;
>
>     tcg_gen_shli_vec(vece, t1, a, halfbits);
>     tcg_gen_shri_vec(vece, t1, t1, halfbits);
>
>     tcg_gen_shli_vec(vece, t2, b, halfbits);
>     tcg_gen_shri_vec(vece, t2, t2, halfbits);
>
>     tcg_gen_add_vec(vece, t, t1, t2);
>
>     tcg_temp_free_vec(t1);
>     tcg_temp_free_vec(t2);
> }
> ...
> op[MO_16];
> {
>     .fniv = gen_vaddwev_s,
>     .fno = gen_helper_vaddwev_w_h,
>     .opt_opc = vecop_list,
>     .vece = MO_32
> },
> ...
> TRANS(vaddwev_w_h, gvec_vvv, MO_16, gvec_vaddwev_s)
>
> input :     0x ffff fffe ffff fffe ffff fffe ffff fffe + 0
> output :    0x 0000 fffe 0000 fffe 0000 fffe 0000 fffe
> correct is  0x ffff fffe ffff fffe ffff fffe ffff fffe.
Use sari above, not shri, for sign-extension. shri shifts in zeros, which is why the high half of each 32-bit lane comes out as 0000.

r~
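
For illustration, here is a minimal sketch of the difference on a single 32-bit lane, in plain C rather than TCG. The helper names ext_even_shr and ext_even_sar are made up for this example and are not QEMU functions. It assumes halfbits = 16 (4 << MO_32) and that >> on a signed value is an arithmetic shift, as it is with gcc and clang.

```c
#include <stdint.h>

/*
 * Hypothetical helpers modelling one 32-bit lane of the vector op.
 * They are not part of QEMU; they only mirror the shl/shr pairs above.
 */

/* shl followed by a logical shr: zero-extends the low 16 bits (shri). */
static inline uint32_t ext_even_shr(uint32_t lane)
{
    return (lane << 16) >> 16;
}

/*
 * shl followed by an arithmetic shr: sign-extends the low 16 bits (sari).
 * Assumes the compiler implements >> on negative int32_t as an
 * arithmetic shift, which the C standard leaves implementation-defined.
 */
static inline uint32_t ext_even_sar(uint32_t lane)
{
    return (uint32_t)((int32_t)(lane << 16) >> 16);
}
```

With lane = 0xfffffffe, ext_even_shr() returns 0x0000fffe, matching the output reported above, and ext_even_sar() returns 0xfffffffe, the expected result.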
