On 12 May 15:55, Jakub Jelinek wrote:
> On Thu, May 12, 2016 at 04:39:52PM +0300, Kirill Yukhin wrote:
> > > --- gcc/config/i386/sse.md.jj     2016-05-04 14:36:08.000000000 +0200
> > > +++ gcc/config/i386/sse.md        2016-05-04 15:16:44.180894303 +0200
> > > @@ -6415,12 +6415,12 @@ (define_insn "avx512f_vec_dup<mode>_1"
> > >  ;; unpcklps with register source since it is shorter.
> > >  (define_insn "*vec_concatv2sf_sse4_1"
> > >    [(set (match_operand:V2SF 0 "register_operand"
> > > -   "=Yr,*x,x,Yr,*x,x,x,*y ,*y")
> > > +   "=Yr,*x,v,Yr,*x,v,v,*y ,*y")
> > >   (vec_concat:V2SF
> > >     (match_operand:SF 1 "nonimmediate_operand"
> > > -   "  0, 0,x, 0,0, x,m, 0 , m")
> > > +   "  0, 0,v, 0,0, v,m, 0 , m")
> > >     (match_operand:SF 2 "vector_move_operand"
> > > -   " Yr,*x,x, m,m, m,C,*ym, C")))]
> > > +   " Yr,*x,v, m,m, m,C,*ym, C")))]
> > >    "TARGET_SSE4_1 && !(MEM_P (operands[1]) && MEM_P (operands[2]))"
> > >    "@
> > >     unpcklps\t{%2, %0|%0, %2}
> > Looks like we were wrong here.
> > We need to use the Yv constraint for vunpcklps, since the EVEX-encoded
> > form of this insn (needed for XMM16+) is available with AVX-512VL only.
> > 
> > Like this:
> > diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> > index d77227a..7d71640 100644
> > --- a/gcc/config/i386/sse.md
> > +++ b/gcc/config/i386/sse.md
> > @@ -6546,12 +6546,12 @@
> >  ;; unpcklps with register source since it is shorter.
> >  (define_insn "*vec_concatv2sf_sse4_1"
> >    [(set (match_operand:V2SF 0 "register_operand"
> > -         "=Yr,*x,v,Yr,*x,v,v,*y ,*y")
> > +         "=Yr,*x,Yv,Yr,*x,v,v,*y ,*y")
> >         (vec_concat:V2SF
> >           (match_operand:SF 1 "nonimmediate_operand"
> > -         "  0, 0,v, 0,0, v,m, 0 , m")
> > +         "  0, 0,Yv, 0,0, v,m, 0 , m")
> >           (match_operand:SF 2 "vector_move_operand"
> > -         " Yr,*x,v, m,m, m,C,*ym, C")))]
> > +         " Yr,*x,Yv, m,m, m,C,*ym, C")))]
> >    "TARGET_SSE4_1 && !(MEM_P (operands[1]) && MEM_P (operands[2]))"
> >    "@
> >     unpcklps\t{%2, %0|%0, %2}
> > 
> > Will check in to main trunk after bootstrap/regtest.
> 
> I'm not sure about the Yv on operand 0; I think without AVX512VL
> HARD_REGNO_MODE_OK will disallow V2SFmode regs in XMM16+ (but this
> is an MMX-ish mode, so maybe we never allow it in XMM16+).
> On the SFmode operands side, you're right, HARD_REGNO_MODE_OK allows
> SFmode in XMM16+ even for only AVX512F.
Agreed.
> 
>       Jakub
