On Tue, May 7, 2019 at 8:49 AM Hongtao Liu <crazy...@gmail.com> wrote:

> > > > > > > > >     This patch enables support for bfloat16, which 
> > > > > > > > > will be in the future Cooper Lake. Please refer to 
> > > > > > > > > https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference
> > > > > > > > > for more details about BF16.
> > > > > > > > >
> > > > > > > > > There are 3 instructions for AVX512BF16: VCVTNE2PS2BF16, 
> > > > > > > > > VCVTNEPS2BF16 and VDPBF16PS, which are Vector 
> > > > > > > > > Neural Network Instructions supporting:
> > > > > > > > >
> > > > > > > > > -       VCVTNE2PS2BF16: Convert Two Packed Single Data to One 
> > > > > > > > > Packed BF16 Data.
> > > > > > > > > -       VCVTNEPS2BF16: Convert Packed Single Data to Packed 
> > > > > > > > > BF16 Data.
> > > > > > > > > -       VDPBF16PS: Dot Product of BF16 Pairs Accumulated into 
> > > > > > > > > Packed Single Precision.
> > > > > > > > >
> > > > > > > > > Since only BF16 intrinsics are supported, we treat it as HI 
> > > > > > > > > for simplicity.
> > > > > > > >
> > > > > > > > I think it was a mistake declaring cvtps2ph and cvtph2ps using
> > > > > > > > HImode instead of HFmode. Is there a compelling reason not to
> > > > > > > > introduce corresponding bf16_format supporting infrastructure
> > > > > > > > and declare these intrinsics using half-binary (HBmode ?) mode
> > > > > > > > instead?
> > > > > > > >
> > > > > > > > Uros.
> > > > > > >
> > > > > > > Bfloat16 isn't an IEEE standard format, which is what we want to
> > > > > > > reserve HFmode for.
> > > > > >
> > > > > > True.
> > > > > >
> > > > > > > The IEEE 754 standard specifies a binary16 as having the 
> > > > > > > following format:
> > > > > > > Sign bit: 1 bit
> > > > > > > Exponent width: 5 bits
> > > > > > > Significand precision: 11 bits (10 explicitly stored)
> > > > > > >
> > > > > > > Bfloat16 has the following format:
> > > > > > > Sign bit: 1 bit
> > > > > > > Exponent width: 8 bits
> > > > > > > Significand precision: 8 bits (7 explicitly stored), as opposed
> > > > > > > to 24 bits in a classical single-precision floating-point format
> > > > > >
> > > > > > This is why I proposed to introduce HBmode (and corresponding
> > > > > > bfloat16_format) to distinguish between IEEE HFmode and BFmode.
> > > > > >
> > > > >
> > > > > Unless there is BF16 language-level support, HBmode has no advantage
> > > > > over HImode. We can add HBmode when we gain BF16 language support.
> > > > >
> > > > > --
> > > > > H.J.
> > > >
> > > > Any other comments? If not, I'll merge this to trunk.
> > >
> > > It is not a regression, so please no.
> >
> > Ehm, "regression fix" ...
> >
> > Uros.
>
> Updated patch.
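
For readers following the format discussion quoted above, here is a
minimal sketch of what the 8-bit exponent / 7-bit stored significand
layout means in practice: a bfloat16 value is the upper 16 bits of an
IEEE single. The helper names float_to_bf16_bits and bf16_bits_to_float
are hypothetical, not part of the patch or of any GCC header, and the
code assumes a 32-bit IEEE float. It rounds to nearest even, but makes
no claim to be bit-exact with VCVTNEPS2BF16 (for example, it does not
model the instruction's denormal handling).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the bfloat16 bit pattern for F,
   rounding to nearest with ties to even.  */
static uint16_t
float_to_bf16_bits (float f)
{
  uint32_t u;

  /* Assumption: float is a 32-bit IEEE single.  */
  assert (sizeof (float) == sizeof (uint32_t));
  memcpy (&u, &f, sizeof u);

  /* NaN: truncate and force a quiet-NaN bit so that the payload
     cannot be rounded away into an infinity.  */
  if ((u & 0x7fffffffu) > 0x7f800000u)
    return (uint16_t) ((u >> 16) | 0x0040u);

  /* Add 0x7fff plus the lowest kept bit: exact ties go to even.  */
  u += 0x7fffu + ((u >> 16) & 1u);
  return (uint16_t) (u >> 16);
}

/* Hypothetical helper: widen a bfloat16 bit pattern back to float.
   This direction is exact, the low 16 bits are simply zero.  */
static float
bf16_bits_to_float (uint16_t h)
{
  uint32_t u = (uint32_t) h << 16;
  float f;

  memcpy (&f, &u, sizeof f);
  return f;
}
```

Since the value is carried around as a plain 16-bit integer here, this
also illustrates why treating the intrinsics' operands as HImode is
workable as long as there is no language-level BF16 type.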

Index: gcc/config/i386/i386-builtins.c
===================================================================
--- gcc/config/i386/i386-builtins.c    (revision 270934)
+++ gcc/config/i386/i386-builtins.c    (working copy)
@@ -1920,6 +1920,7 @@
   F_VPCLMULQDQ,
   F_AVX512VNNI,
   F_AVX512BITALG,
+  F_AVX512BF16,
   F_MAX
 };

@@ -2064,7 +2065,8 @@
   {"gfni",    F_GFNI,    P_ZERO},
   {"vpclmulqdq", F_VPCLMULQDQ, P_ZERO},
   {"avx512vnni", F_AVX512VNNI, P_ZERO},
-  {"avx512bitalg", F_AVX512BITALG, P_ZERO}
+  {"avx512bitalg", F_AVX512BITALG, P_ZERO},
+  {"avx512bf16", F_AVX512BF16, P_ZERO}
 };

 /* This parses the attribute arguments to target in DECL and determines

You also need to update cpuinfo.h and cpuinfo.c in libgcc/config/i386
with avx512bf16, plus relevant test files.
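
To illustrate why both sides have to change together, here is a
simplified, self-contained sketch of the name-to-feature table pattern
touched by the hunks above. It is not the actual GCC code: the struct
feature_entry type and the lookup_feature helper are hypothetical, the
priority field is omitted, and only three features are listed. My
understanding is that the enumerator values are shared with the libgcc
side, so an entry added in only one place would be misresolved at run
time.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Reduced copy of the feature enum; the real one has many more
   entries before these.  */
enum processor_features
{
  F_AVX512VNNI,
  F_AVX512BITALG,
  F_AVX512BF16,
  F_MAX
};

/* Hypothetical table entry: feature name and its enumerator.  */
struct feature_entry
{
  const char *name;
  enum processor_features feature;
};

static const struct feature_entry feature_list[] =
{
  {"avx512vnni", F_AVX512VNNI},
  {"avx512bitalg", F_AVX512BITALG},
  {"avx512bf16", F_AVX512BF16}
};

/* Hypothetical lookup: return the enumerator for NAME, or F_MAX if
   the name is not in the table.  */
static enum processor_features
lookup_feature (const char *name)
{
  size_t i;

  assert (name != NULL);
  for (i = 0; i < sizeof feature_list / sizeof feature_list[0]; i++)
    if (strcmp (feature_list[i].name, name) == 0)
      return feature_list[i].feature;
  return F_MAX;
}
```

With this sketch, lookup_feature ("avx512bf16") yields F_AVX512BF16,
while an unknown name falls through to F_MAX.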

Index: gcc/testsuite/gcc.target/i386/avx-1.c
Index: gcc/testsuite/gcc.target/i386/avx-2.c

No need to update the above two files; the sse-*.c changes are enough
to cover the new functionality.

Otherwise LGTM, but please repost the updated patch with a ChangeLog
entry (please see [1]).

[1] https://www.gnu.org/software/gcc/contribute.html#patches

Uros.
