On Mon, Feb 11, 2019 at 5:11 AM graham stott <graham.st...@btinternet.com> wrote:
>
> All these patches from HJL have no testcases. Are they even suitable for GCC 9
> at this stage?

All my changes are covered by

https://gcc.gnu.org/ml/gcc-patches/2019-02/msg00632.html

> -------- Original message --------
> From: Uros Bizjak <ubiz...@gmail.com>
> Date: 11/02/2019 12:51 (GMT+00:00)
> To: "H.J. Lu" <hjl.to...@gmail.com>
> Cc: GCC Patches <gcc-patches@gcc.gnu.org>
> Subject: Re: [PATCH 12/43] i386: Emulate MMX vec_dupv2si with SSE
>
> On Mon, Feb 11, 2019 at 1:26 PM H.J. Lu <hjl.to...@gmail.com> wrote:
> >
> > On Sun, Feb 10, 2019 at 11:25 PM Uros Bizjak <ubiz...@gmail.com> wrote:
> > >
> > > On Mon, Feb 11, 2019 at 2:04 AM H.J. Lu <hjl.to...@gmail.com> wrote:
> > > >
> > > > On Sun, Feb 10, 2019 at 1:49 PM Uros Bizjak <ubiz...@gmail.com> wrote:
> > > > >
> > > > > On Sun, Feb 10, 2019 at 10:45 PM Uros Bizjak <ubiz...@gmail.com>
> > > > > wrote:
> > > > > >
> > > > > > > > > +  [(const_int 0)]
> > > > > > > > > +{
> > > > > > > > > +  /* Emulate MMX vec_dupv2si with SSE vec_dupv4si. */
> > > > > > > > > +  rtx op0 = gen_rtx_REG (V4SImode, REGNO (operands[0]));
> > > > > > > > > +  rtx insn = gen_vec_dupv4si (op0, operands[1]);
> > > > > > > > > +  emit_insn (insn);
> > > > > > > > > +  DONE;
> > > > > > > >
> > > > > > > > Please write this simple RTX explicitly in the place of
> > > > > > > > (const_int 0) above.
> > > > > > >
> > > > > > > rtx insn = gen_vec_dupv4si (op0, operands[1]);
> > > > > > >
> > > > > > > is easy. How do I write
> > > > > > >
> > > > > > > rtx op0 = gen_rtx_REG (V4SImode, REGNO (operands[0]));
> > > > > > >
> > > > > > > in place of (const_int 0)?
> > > > > >
> > > > > > [(set (match_dup 2)
> > > > > >       (vec_duplicate:V4SI (match_dup 1)))]
> > > > > >
> > > > > > with
> > > > > >
> > > > > > "operands[2] = gen_rtx_REG (V4SImode, REGNO (operands[0]));"
> > > > > >
> > > > > > or even better:
> > > > > >
> > > > > > "operands[2] = gen_lowpart (V4SImode, operands[0]);"
> > > > > >
> > > > > > in the preparation statement.
> > > > >
> > > > > Even shorter is
> > > > >
> > > > > "operands[0] = gen_lowpart (V4SImode, operands[0]);"
> > > > >
> > > > > and use (match_dup 0) instead of (match_dup 2) in the RTX.
> > > > >
> > > > > There are plenty of examples throughout sse.md.
> > > > >
> > > >
> > > > This works:
> > > >
> > > > (define_insn_and_split "*vec_dupv2si"
> > > >   [(set (match_operand:V2SI 0 "register_operand" "=y,x,Yv")
> > > >         (vec_duplicate:V2SI
> > > >           (match_operand:SI 1 "register_operand" "0,0,Yv")))]
> > > >   "TARGET_MMX || TARGET_MMX_WITH_SSE"
> > > >   "@
> > > >    punpckldq\t%0, %0
> > > >    #
> > > >    #"
> > > >   "TARGET_MMX_WITH_SSE && reload_completed"
> > > >   [(set (match_dup 0)
> > > >         (vec_duplicate:V4SI (match_dup 1)))]
> > > >   "operands[0] = gen_rtx_REG (V4SImode, REGNO (operands[0]));"
> > > >   [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
> > > >    (set_attr "type" "mmxcvt,ssemov,ssemov")
> > > >    (set_attr "mode" "DI,TI,TI")])
> > >
> > > If it works, then gen_lowpart is preferred due to extra checks.
> > > However, it would result in a paradoxical subreg, so I wonder if these
> > > extra checks allow this transformation.
> >
> > gen_lowpart doesn't work:
>
> Ah, we need lowpart_subreg after reload.
>
> Uros.

--
H.J.