On Fri, 2020-06-12 at 18:31 -0500, Segher Boessenkool wrote:
> Hi!
>
> On Thu, Jun 11, 2020 at 11:22:33PM -0500, will schmidt wrote:
> > Fix codegen implementation for the builtin
> > vec_pack_to_short_fp32.
> > +;; Convert two vector F32 to packed vector f16.
> > +(define_expand "convert_4f32_8f16"
> > + [(set (match_operand:V8HI 0 "register_operand" "=v")
> > + (unspec:V8HI [(match_operand:V4SF 1 "register_operand" "v")
> > + (match_operand:V4SF 2 "register_operand" "v")]
> > + UNSPEC_CONVERT_4F32_8F16))]
> > + "TARGET_P9_VECTOR"
> > +{
> > + rtx rtx_tmp_hi = gen_reg_rtx (V4SImode);
> > + rtx rtx_tmp_lo = gen_reg_rtx (V4SImode);
> > +
> > + emit_insn (gen_vsx_xvcvsphp (rtx_tmp_hi, operands[1] ));
> > + emit_insn (gen_vsx_xvcvsphp (rtx_tmp_lo, operands[2] ));
> > + if (BYTES_BIG_ENDIAN)
> > + emit_insn (gen_altivec_vpkuwum (operands[0], rtx_tmp_lo,
> > rtx_tmp_hi));
> > + else
> > + emit_insn (gen_altivec_vpkuwum (operands[0], rtx_tmp_hi,
> > rtx_tmp_lo));
> > + DONE;
> > +})
>
> Why is this different from the 8i16, which doesn't have the swap of
> the
> operands for BE?
The 8f16 implementation specifies the swap variation between LE/BE.
I'm not sure of the 8i16 implementation was defined without that, or
if this was just missed. I suspect it should have been there too.
>
> If the comment before the pattern would say how it orders the
> elements,
> that might clarify things. (This really is documenting the unspec,
> but
> there is no better place for documenting this afaics).
Ok. I'll double-check and ensure i've got the right wording.
>
> > +;; Generate xvcvsphp
> > +(define_insn "vsx_xvcvsphp"
> > + [(set (match_operand:V4SI 0 "register_operand" "=wa")
> > + (unspec:V4SI [(match_operand:V4SF 1 "vsx_register_operand"
> > "wa")]
> > + UNSPEC_VSX_XVCVSPHP))]
>
> (Should be a tab before the unspec as well, not eight spaces).
I see a tab and 5 spaces. I'll double-check.
>
> > --- a/gcc/testsuite/gcc.target/powerpc/builtins-1-p9-runnable.c
> > +++ b/gcc/testsuite/gcc.target/powerpc/builtins-1-p9-runnable.c
> > @@ -1,25 +1,37 @@
> > /* { dg-do run { target { powerpc*-*-linux* && { lp64 &&
> > p9vector_hw } } } } */
>
> Why the lp64 here?
>
> > /* { dg-require-effective-target powerpc_p9vector_ok } */
>
> That is implied by p9vector_hw (or it should be, at least :-) )
Hmm, yes. Those are issues with the original test. I'll update.
Thanks
-Will
>
>
> Segher