Re: [PATCH #2], Define __FP_FAST_FMAF128 on PowerPC ISA 3.0

Michael Meissner Mon, 02 Oct 2017 16:54:13 -0700

Whoops, I forgot to attach the patch.

On Mon, Oct 02, 2017 at 07:51:00PM -0400, Michael Meissner wrote:
> On Thu, Sep 28, 2017 at 12:40:24AM +0000, Joseph Myers wrote:
> > On Wed, 27 Sep 2017, Michael Meissner wrote:
> > 
> > > The glibc team has requested we define the standard macro
> > > (__FP_FAST_FMAF128) for PowerPC code when we have the IEEE 128-bit
> > > floating point hardware instructions enabled.
> > 
> > It's not a standard macro.  TS 18661-3 has FP_FAST_FMAF128 as an optional 
> > math.h macro (but glibc doesn't define it anywhere at present).
> > 
> > > This patch does this in the PowerPC backend.  As I look at the whole
> > > issue, at some point we should do this more in the machine independent
> > > portion of the compiler.  I have some initial patches to do this in the
> > > c-family files, but at the present time, the patches are not complete,
> > > and I need to think about it more.
> > 
> > I think a machine-independent definition (for _FloatN / _FloatNx types in 
> > general) should go along with machine-independent fmafN / fmafNx built-in 
> > functions; when the built-in function is machine-specific, it's natural 
> > for the macro to be as well.
> > 
> > But in any case, the new macro should be documented in cpp.texi alongside 
> > the existing __FP_FAST_FMA* macros (probably in the generic 
> > __FP_FAST_FMAF@var{n} and __FP_FAST_FMAF@var{n}X form).
> 
> This patch adds support for the built-in __builtin_fmaf<N> and
> __builtin_fmaf<N>x functions if the target machine supports an appropriate
> fused multiply-add (FMA) instruction.  This patch replaces the original
> PowerPC-specific patch.
> 
> Because it involves changes in the built-in support, both the c and c-family
> subdirectories, as well as PowerPC changes, I added the global/release
> maintainers to the To: list.
> 
> I have done a bootstrap and make check on a little endian Power8 with no
> regressions in the tests.  I have verified that the changed and new tests both
> ran fine.
> 
> I have also bootstrapped the changes on an x86-64 compiler, and it
> bootstrapped fine.  I am currently running the unmodified build, but I'm not
> expecting any changes in the test suite.
> 
> Assuming the x86-64 tests also have no regressions, can I check these changes
> into the trunk?
> 
> [gcc]
> 2017-10-02  Michael Meissner  <[email protected]>
> 
>       * builtins.def (BUILT_IN_FMAF16): Add support for fused
>       multiply-add built-in functions for _Float<N> and _Float<N>x
>       types.
>       (BUILT_IN_FMAF32): Likewise.
>       (BUILT_IN_FMAF64): Likewise.
>       (BUILT_IN_FMAF128): Likewise.
>       (BUILT_IN_FMAF32X): Likewise.
>       (BUILT_IN_FMAF64X): Likewise.
>       (BUILT_IN_FMAF128X): Likewise.
>       * builtin-types.def (BT_FN_FLOAT16_FLOAT16_FLOAT16_FLOAT16):
>       Likewise.
>       (BT_FN_FLOAT32_FLOAT32_FLOAT32_FLOAT32): Likewise.
>       (BT_FN_FLOAT64_FLOAT64_FLOAT64_FLOAT64): Likewise.
>       (BT_FN_FLOAT128_FLOAT128_FLOAT128_FLOAT128): Likewise.
>       (BT_FN_FLOAT32X_FLOAT32X_FLOAT32X_FLOAT32X): Likewise.
>       (BT_FN_FLOAT64X_FLOAT64X_FLOAT64X_FLOAT64X): Likewise.
>       (BT_FN_FLOAT128X_FLOAT128X_FLOAT128X_FLOAT128X): Likewise.
>       * builtins.c (expand_builtin_mathfn_ternary): Likewise.
>       (expand_builtin): Add fused multiply-add builtin support for
>       _Float<N> and _Float<N>x types.  Issue a warning if the machine
>       does not provide an appropriate FMA insn.
>       (fold_builtin_3): Add support for fused multiply-add built-in
>       functions for _Float<N> and _Float<N>x types.
>       * config/rs6000/rs6000-builtin.def (FMAF128): Delete the
>       definition of __builtin_fmaf128, since this is now done in
>       machine independent code.
>       * doc/cpp.texi (__FP_FAST_FMAF16): Document macros that indicate
>       fused multiply-add is implemented for the _Float<N> and
>       _Float<N>x types.
>       (__FP_FAST_FMAF32): Likewise.
>       (__FP_FAST_FMAF64): Likewise.
>       (__FP_FAST_FMAF128): Likewise.
>       (__FP_FAST_FMAF32X): Likewise.
>       (__FP_FAST_FMAF64X): Likewise.
>       (__FP_FAST_FMAF128X): Likewise.
> 
> [gcc/c]
> 2017-10-02  Michael Meissner  <[email protected]>
> 
>       * c-decl.c (header_for_builtin_fn): Add support for fused
>       multiply-add built-in functions for _Float<N> and _Float<N>x
>       types.
> 
> [gcc/c-family]
> 2017-10-02  Michael Meissner  <[email protected]>
> 
>       * c-cppbuiltin.c (mode_has_fma): Add support for PowerPC __float128
>       FMA (KFmode) if long double != __float128.
>       (c_cpp_builtins): Define __FP_FAST_FMAF<N> if _Float<N> fused
>       multiply-add is supported.  Define __FP_FAST_FMAF<N>X if
>       _Float<N>x fused multiply-add is supported.
> 
> [gcc/testsuite]
> 2017-10-02  Michael Meissner  <[email protected]>
> 
>       * gcc.target/powerpc/float128-fma2.c: Change error to new
>       warning.
>       * gcc.target/powerpc/float128-fma3.c: New test.
> 
> 
> -- 
> Michael Meissner, IBM
> IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
> email: [email protected], phone: +1 (978) 899-4797


-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: [email protected], phone: +1 (978) 899-4797
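
For readers of the archive, here is a minimal sketch of how user code might
test the macro discussed above. It assumes a compiler that defines
__FP_FAST_FMAF128 only when a hardware _Float128 FMA is available, as the
cpp.texi text in the patch describes. The names wide_t, WIDE_FMA,
wide_mul_add and check_wide_mul_add are hypothetical and appear nowhere in
GCC or glibc.

```c
#include <assert.h>

/* Pick the widest type for which a fast fused multiply-add is advertised.
   wide_t and WIDE_FMA are illustrative names only.  */
#ifdef __FP_FAST_FMAF128
typedef _Float128 wide_t;
#define WIDE_FMA(a, b, c) __builtin_fmaf128 ((a), (b), (c))
#else
/* Fallback: an ordinary multiply followed by an add.  This is not fused,
   so it rounds twice and can differ from a true FMA in the last bit.  */
typedef long double wide_t;
#define WIDE_FMA(a, b, c) ((a) * (b) + (c))
#endif

static wide_t
wide_mul_add (wide_t a, wide_t b, wide_t c)
{
  return WIDE_FMA (a, b, c);
}

static void
check_wide_mul_add (void)
{
  /* 2 * 3 + 4 is exactly representable, so both paths give 10.  */
  assert (wide_mul_add (2, 3, 4) == 10);
}
```

The fallback branch keeps the code compiling on targets without the
instruction, at the cost of the single-rounding guarantee.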

Index: gcc/builtins.def
===================================================================
--- gcc/builtins.def    (revision 253358)
+++ gcc/builtins.def    (working copy)
@@ -382,6 +382,9 @@ DEF_C99_C90RES_BUILTIN (BUILT_IN_FLOORL,
 DEF_C99_BUILTIN        (BUILT_IN_FMA, "fma", BT_FN_DOUBLE_DOUBLE_DOUBLE_DOUBLE, ATTR_MATHFN_FPROUNDING)
 DEF_C99_BUILTIN        (BUILT_IN_FMAF, "fmaf", BT_FN_FLOAT_FLOAT_FLOAT_FLOAT, ATTR_MATHFN_FPROUNDING)
 DEF_C99_BUILTIN        (BUILT_IN_FMAL, "fmal", BT_FN_LONGDOUBLE_LONGDOUBLE_LONGDOUBLE_LONGDOUBLE, ATTR_MATHFN_FPROUNDING)
+#define FMA_TYPE(F) BT_FN_##F##_##F##_##F##_##F
+DEF_GCC_FLOATN_NX_BUILTINS (BUILT_IN_FMA, "fma", FMA_TYPE, ATTR_MATHFN_FPROUNDING)
+#undef FMA_TYPE
 DEF_C99_BUILTIN        (BUILT_IN_FMAX, "fmax", BT_FN_DOUBLE_DOUBLE_DOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_C99_BUILTIN        (BUILT_IN_FMAXF, "fmaxf", BT_FN_FLOAT_FLOAT_FLOAT, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_C99_BUILTIN        (BUILT_IN_FMAXL, "fmaxl", BT_FN_LONGDOUBLE_LONGDOUBLE_LONGDOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
Index: gcc/builtin-types.def
===================================================================
--- gcc/builtin-types.def       (revision 253358)
+++ gcc/builtin-types.def       (working copy)
@@ -544,6 +544,20 @@ DEF_FUNCTION_TYPE_3 (BT_FN_DOUBLE_DOUBLE
                     BT_DOUBLE, BT_DOUBLE, BT_DOUBLE, BT_DOUBLE)
 DEF_FUNCTION_TYPE_3 (BT_FN_LONGDOUBLE_LONGDOUBLE_LONGDOUBLE_LONGDOUBLE,
                     BT_LONGDOUBLE, BT_LONGDOUBLE, BT_LONGDOUBLE, BT_LONGDOUBLE)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT16_FLOAT16_FLOAT16_FLOAT16,
+                    BT_FLOAT16, BT_FLOAT16, BT_FLOAT16, BT_FLOAT16)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT32_FLOAT32_FLOAT32_FLOAT32,
+                    BT_FLOAT32, BT_FLOAT32, BT_FLOAT32, BT_FLOAT32)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT64_FLOAT64_FLOAT64_FLOAT64,
+                    BT_FLOAT64, BT_FLOAT64, BT_FLOAT64, BT_FLOAT64)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT128_FLOAT128_FLOAT128_FLOAT128,
+                    BT_FLOAT128, BT_FLOAT128, BT_FLOAT128, BT_FLOAT128)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT32X_FLOAT32X_FLOAT32X_FLOAT32X,
+                    BT_FLOAT32X, BT_FLOAT32X, BT_FLOAT32X, BT_FLOAT32X)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT64X_FLOAT64X_FLOAT64X_FLOAT64X,
+                    BT_FLOAT64X, BT_FLOAT64X, BT_FLOAT64X, BT_FLOAT64X)
+DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT128X_FLOAT128X_FLOAT128X_FLOAT128X,
+                    BT_FLOAT128X, BT_FLOAT128X, BT_FLOAT128X, BT_FLOAT128X)
 DEF_FUNCTION_TYPE_3 (BT_FN_FLOAT_FLOAT_FLOAT_INTPTR,
                     BT_FLOAT, BT_FLOAT, BT_FLOAT, BT_INT_PTR)
 DEF_FUNCTION_TYPE_3 (BT_FN_DOUBLE_DOUBLE_DOUBLE_INTPTR,
Index: gcc/builtins.c
===================================================================
--- gcc/builtins.c      (revision 253358)
+++ gcc/builtins.c      (working copy)
@@ -2067,6 +2067,7 @@ expand_builtin_mathfn_ternary (tree exp,
   switch (DECL_FUNCTION_CODE (fndecl))
     {
     CASE_FLT_FN (BUILT_IN_FMA):
+    CASE_FLT_FN_FLOATN_NX (BUILT_IN_FMA):
       builtin_optab = fma_optab; break;
     default:
       gcc_unreachable ();
@@ -6563,6 +6564,18 @@ expand_builtin (tree exp, rtx target, rt
        return target;
       break;
 
+      /* Warn if the user called __builtin_fmaf{32,64,128} and there is no fast
+        insn to support it.  */
+    CASE_FLT_FN_FLOATN_NX (BUILT_IN_FMA):
+      target = expand_builtin_mathfn_ternary (exp, target, subtarget);
+      if (target)
+       return target;
+
+      warning_at (tree_nonartificial_location (exp), 0,
+                 "%KThe built-in function %<__builtin_fmafN ()%> may not be "
+                 "supported", exp);
+      break;
+
     CASE_FLT_FN (BUILT_IN_ILOGB):
       if (! flag_unsafe_math_optimizations)
        break;
@@ -8988,6 +9001,7 @@ fold_builtin_3 (location_t loc, tree fnd
       return fold_builtin_sincos (loc, arg0, arg1, arg2);
 
     CASE_FLT_FN (BUILT_IN_FMA):
+    CASE_FLT_FN_FLOATN_NX (BUILT_IN_FMA):
       return fold_builtin_fma (loc, arg0, arg1, arg2, type);
 
     CASE_FLT_FN (BUILT_IN_REMQUO):
Index: gcc/config/rs6000/rs6000-builtin.def
===================================================================
--- gcc/config/rs6000/rs6000-builtin.def        (revision 253358)
+++ gcc/config/rs6000/rs6000-builtin.def        (working copy)
@@ -2369,7 +2369,6 @@ BU_FLOAT128_2 (COPYSIGNQ, "copysignq",
    hardware.  These functions use the new 'f128' suffix.  Eventually these
    should be folded into the common built-in function handling. */
 BU_FLOAT128_1_HW (SQRTF128,    "sqrtf128",     CONST, sqrtkf2)
-BU_FLOAT128_3_HW (FMAF128,     "fmaf128",      CONST, fmakf4_hw)
 
 /* 1 argument crypto functions.  */
 BU_CRYPTO_1 (VSBOX,            "vsbox",          CONST, crypto_vsbox)
Index: gcc/doc/cpp.texi
===================================================================
--- gcc/doc/cpp.texi    (revision 253358)
+++ gcc/doc/cpp.texi    (working copy)
@@ -2400,6 +2400,20 @@ was used).  If 1 or more, it indicates t
 those requirements; this does not mean that all relevant language
 features are supported by GCC.
 
+@item __FP_FAST_FMAF16
+@itemx __FP_FAST_FMAF32
+@itemx __FP_FAST_FMAF64
+@itemx __FP_FAST_FMAF128
+@itemx __FP_FAST_FMAF32X
+@itemx __FP_FAST_FMAF64X
+@itemx __FP_FAST_FMAF128X
+This macro is defined with value 1 if the backend supports the
+@code{__builtin_fmaf16}, @code{__builtin_fmaf32},
+@code{__builtin_fmaf64}, @code{__builtin_fmaf128},
+@code{__builtin_fmaf32x}, @code{__builtin_fmaf64x}, or
+@code{__builtin_fmaf128x} builtin functions that do fused multiply-add
+on the types defined in IEEE 754 (IEC 60559).
+
 @item __NO_MATH_ERRNO__
 This macro is defined if @option{-fno-math-errno} is used, or enabled
 by another option such as @option{-ffast-math} or by default.
Index: gcc/c/c-decl.c
===================================================================
--- gcc/c/c-decl.c      (revision 253358)
+++ gcc/c/c-decl.c      (working copy)
@@ -3171,6 +3171,7 @@ header_for_builtin_fn (enum built_in_fun
     CASE_FLT_FN (BUILT_IN_FDIM):
     CASE_FLT_FN (BUILT_IN_FLOOR):
     CASE_FLT_FN (BUILT_IN_FMA):
+    CASE_FLT_FN_FLOATN_NX (BUILT_IN_FMA):
     CASE_FLT_FN (BUILT_IN_FMAX):
     CASE_FLT_FN (BUILT_IN_FMIN):
     CASE_FLT_FN (BUILT_IN_FMOD):
Index: gcc/c-family/c-cppbuiltin.c
===================================================================
--- gcc/c-family/c-cppbuiltin.c (revision 253358)
+++ gcc/c-family/c-cppbuiltin.c (working copy)
@@ -82,6 +82,11 @@ mode_has_fma (machine_mode mode)
       return !!HAVE_fmadf4;
 #endif
 
+#ifdef HAVE_fmakf4     /* PowerPC if long double != __float128.  */
+    case E_KFmode:
+      return !!HAVE_fmakf4;
+#endif
+
 #ifdef HAVE_fmaxf4
     case E_XFmode:
       return !!HAVE_fmaxf4;
@@ -1119,7 +1124,7 @@ c_cpp_builtins (cpp_reader *pfile)
               floatn_nx_types[i].extended ? "X" : "");
       sprintf (csuffix, "F%d%s", floatn_nx_types[i].n,
               floatn_nx_types[i].extended ? "x" : "");
-      builtin_define_float_constants (prefix, csuffix, "%s", NULL,
+      builtin_define_float_constants (prefix, csuffix, "%s", csuffix,
                                      FLOATN_NX_TYPE_NODE (i));
     }
 
Index: gcc/testsuite/gcc.target/powerpc/float128-fma2.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/float128-fma2.c    (revision 253358)
+++ gcc/testsuite/gcc.target/powerpc/float128-fma2.c    (working copy)
@@ -5,5 +5,5 @@
 __float128
 xfma (__float128 a, __float128 b, __float128 c)
 {
-  return __builtin_fmaf128 (a, b, c); /* { dg-error "ISA 3.0 IEEE 128-bit" } */
+  return __builtin_fmaf128 (a, b, c); /* { dg-warning "__builtin_fmafN" } */
 }
Index: gcc/testsuite/gcc.target/powerpc/float128-fma3.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/float128-fma3.c    (nonexistent)
+++ gcc/testsuite/gcc.target/powerpc/float128-fma3.c    (working copy)
@@ -0,0 +1,63 @@
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mpower9-vector -O2" } */
+
+/* Make sure the appropriate FMA fast macros are defined.  */
+
+#include <math.h>
+
+#ifdef __FP_FAST_FMAF
+float
+do_fmaf (float a, float b, float c)
+{
+  return __builtin_fmaf (a, b, c);
+}
+#else
+#error "__FP_FAST_FMAF should be defined"
+#endif
+
+#ifdef __FP_FAST_FMAF32
+_Float32
+do_fmaf32 (_Float32 a, _Float32 b, _Float32 c)
+{
+  return __builtin_fmaf32 (a, b, -c);
+}
+#else
+#error "__FP_FAST_FMAF32 should be defined"
+#endif
+
+#ifdef __FP_FAST_FMA
+double
+do_fma (double a, double b, double c)
+{
+  return __builtin_fma (a, b, c);
+}
+#else
+#error "__FP_FAST_FMA should be defined"
+#endif
+
+#ifdef __FP_FAST_FMAF64
+_Float64
+do_fmaf64 (_Float64 a, _Float64 b, _Float64 c)
+{
+  return __builtin_fmaf64 (a, b, -c);
+}
+#else
+#error "__FP_FAST_FMAF64 should be defined"
+#endif
+
+#ifdef __FP_FAST_FMAF128
+_Float128
+do_fmaf128 (_Float128 a, _Float128 b, _Float128 c)
+{
+  return __builtin_fmaf128 (a, b, c);
+}
+#else
+#error "__FP_FAST_FMAF128 should be defined"
+#endif
+
+/* { dg-final { scan-assembler {\mfmadds\M|\mxsmadd.sp\M}  } } */
+/* { dg-final { scan-assembler {\mfmsubs\M|\mxsmsub.sp\M}  } } */
+/* { dg-final { scan-assembler {\mfmadd\M|\mxsmadd.dp\M}   } } */
+/* { dg-final { scan-assembler {\mfmsub\M|\mxsmsub.dp\M}   } } */
+/* { dg-final { scan-assembler {\mxsmaddqp\M}              } } */
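
The FMA_TYPE helper in the builtins.def hunk relies on preprocessor token
pasting to build the function-type names that the builtin-types.def hunk
defines. The standalone sketch below shows that expansion in isolation. It
assumes DEF_GCC_FLOATN_NX_BUILTINS applies the type macro to tokens such as
FLOAT128 and FLOAT32X, which matches the names in the ChangeLog. FMA_STR,
FMA_XSTR and the two functions are hypothetical names used only for
illustration.

```c
#include <assert.h>
#include <string.h>

/* Same token-pasting helper as in the builtins.def hunk.  */
#define FMA_TYPE(F) BT_FN_##F##_##F##_##F##_##F

/* Two-level stringification so the argument is macro-expanded before it
   is turned into a string.  Illustrative names, not part of GCC.  */
#define FMA_STR(x) #x
#define FMA_XSTR(x) FMA_STR (x)

/* Return the identifier FMA_TYPE produces for FLOAT128, as a string.  */
static const char *
fma_type_name_float128 (void)
{
  return FMA_XSTR (FMA_TYPE (FLOAT128));
}

/* Check the pasted names against the builtin-types.def entries.  */
static void
check_fma_type_expansion (void)
{
  assert (strcmp (fma_type_name_float128 (),
                  "BT_FN_FLOAT128_FLOAT128_FLOAT128_FLOAT128") == 0);
  assert (strcmp (FMA_XSTR (FMA_TYPE (FLOAT32X)),
                  "BT_FN_FLOAT32X_FLOAT32X_FLOAT32X_FLOAT32X") == 0);
}
```

Defining FMA_TYPE just before the single use and undefining it afterwards,
as the patch does, keeps the helper from leaking into the rest of
builtins.def.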
