On Fri, 2020-06-12 at 18:31 -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Thu, Jun 11, 2020 at 11:22:33PM -0500, will schmidt wrote:
> >   Fix codegen implementation for the builtin
> > vec_pack_to_short_fp32.
> > +;; Convert two vector F32 to packed vector f16.
> > +(define_expand "convert_4f32_8f16"
> > +  [(set (match_operand:V8HI 0 "register_operand" "=v")
> > +   (unspec:V8HI [(match_operand:V4SF 1 "register_operand" "v")
> > +                 (match_operand:V4SF 2 "register_operand" "v")]
> > +                UNSPEC_CONVERT_4F32_8F16))]
> > +  "TARGET_P9_VECTOR"
> > +{
> > +  rtx rtx_tmp_hi = gen_reg_rtx (V4SImode);
> > +  rtx rtx_tmp_lo = gen_reg_rtx (V4SImode);
> > +
> > +  emit_insn (gen_vsx_xvcvsphp (rtx_tmp_hi, operands[1] ));
> > +  emit_insn (gen_vsx_xvcvsphp (rtx_tmp_lo, operands[2] ));
> > +  if (BYTES_BIG_ENDIAN)
> > +    emit_insn (gen_altivec_vpkuwum (operands[0], rtx_tmp_lo,
> > rtx_tmp_hi));
> > +  else
> > +    emit_insn (gen_altivec_vpkuwum (operands[0], rtx_tmp_hi,
> > rtx_tmp_lo));
> > +  DONE;
> > +})
> 
> Why is this different from the 8i16, which doesn't have the swap of
> the
> operands for BE?

The 8f16 implementation specifies the swap variation between LE/BE. 
I'm not sure of the 8i16 implementation was defined without that, or
if this was just missed.  I suspect it should have been there too.

> 
> If the comment before the pattern would say how it orders the
> elements,
> that might clarify things.  (This really is documenting the unspec,
> but
> there is no better place for documenting this afaics).

Ok.  I'll double-check and ensure i've got the right wording. 

> 
> > +;; Generate xvcvsphp
> > +(define_insn "vsx_xvcvsphp"
> > +  [(set (match_operand:V4SI 0 "register_operand" "=wa")
> > +        (unspec:V4SI [(match_operand:V4SF 1 "vsx_register_operand"
> > "wa")]
> > +                UNSPEC_VSX_XVCVSPHP))]
> 
> (Should be a tab before the unspec as well, not eight spaces).

I see a tab and 5 spaces.   I'll double-check. 

> 
> > --- a/gcc/testsuite/gcc.target/powerpc/builtins-1-p9-runnable.c
> > +++ b/gcc/testsuite/gcc.target/powerpc/builtins-1-p9-runnable.c
> > @@ -1,25 +1,37 @@
> >  /* { dg-do run { target { powerpc*-*-linux* && { lp64 &&
> > p9vector_hw } } } } */
> 
> Why the lp64 here?
> 
> >  /* { dg-require-effective-target powerpc_p9vector_ok } */
> 
> That is implied by p9vector_hw (or it should be, at least :-) )

Hmm, yes.  Those are issues with the original test.  I'll update.

Thanks
-Will

> 
> 
> Segher

Reply via email to