[clang] [Clang] Add __builtin_(elementwise|reduce)_(max|min)imum (PR #110198)

Francis Visoiu Mistrih via cfe-commits Tue, 01 Oct 2024 10:03:19 -0700

https://github.com/francisvm updated 
https://github.com/llvm/llvm-project/pull/110198


From af6d6b8f84b4972f6063e195a39eb7e6a29d30ea Mon Sep 17 00:00:00 2001
From: Francis Visoiu Mistrih <[email protected]>
Date: Thu, 26 Sep 2024 18:05:09 -0700
Subject: [PATCH] [Clang] Add __builtin_elementwise|reduce_max|minimum

LLVM already provides the corresponding intrinsics, but Clang has no
builtins that expose them directly to code that needs to distinguish
between the different NaN semantics.
---
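The NaN distinction mentioned in the commit message can be sketched in
portable C. This is only an emulation, assuming the semantics LangRef
documents for llvm.maximum/llvm.minimum (NaN is propagated, and -0.0 is
ordered below +0.0); it does not call the new builtins, which need a Clang
that contains this patch. The helper names ieee_maximum, ieee_minimum and
nan_semantics_demo are hypothetical and appear nowhere in the patch.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical helper emulating IEEE 754-2019 "maximum":
   the result is NaN if either operand is NaN, and +0.0 beats -0.0. */
static double ieee_maximum(double a, double b) {
    if (isnan(a) || isnan(b))
        return NAN;
    if (a == 0.0 && b == 0.0)
        return signbit(a) ? b : a;  /* prefer +0.0 when one is available */
    return a > b ? a : b;
}

/* Hypothetical helper emulating IEEE 754-2019 "minimum":
   the result is NaN if either operand is NaN, and -0.0 beats +0.0. */
static double ieee_minimum(double a, double b) {
    if (isnan(a) || isnan(b))
        return NAN;
    if (a == 0.0 && b == 0.0)
        return signbit(a) ? a : b;  /* prefer -0.0 when one is available */
    return a < b ? a : b;
}

/* Contrast with C's fmax, which returns the non-NaN operand. */
static void nan_semantics_demo(void) {
    assert(fmax(NAN, 1.0) == 1.0);          /* fmax drops the NaN */
    assert(isnan(ieee_maximum(NAN, 1.0)));  /* maximum propagates it */
}
```

With a compiler that has the patch applied, the intent is that
__builtin_elementwise_maximum(x, y) behaves like ieee_maximum above for
each element, while __builtin_elementwise_max keeps the fmax-like
behaviour.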
 clang/docs/LanguageExtensions.rst             | 158 ++++++++++--------
 clang/include/clang/Basic/Builtins.td         |  24 +++
 .../clang/Basic/DiagnosticSemaKinds.td        |   3 +-
 clang/include/clang/Sema/Sema.h               |   6 +-
 clang/lib/CodeGen/CGBuiltin.cpp               |  39 +++++
 clang/lib/Sema/SemaChecking.cpp               |  49 ++++--
 .../test/CodeGen/builtins-elementwise-math.c  |  76 +++++++++
 clang/test/CodeGen/builtins-reduction-math.c  |  24 +++
 .../CodeGen/strictfp-elementwise-bulitins.cpp |  20 +++
 clang/test/Sema/builtins-elementwise-math.c   |  82 +++++++++
 clang/test/Sema/builtins-reduction-math.c     |  28 ++++
 .../SemaCXX/builtins-elementwise-math.cpp     |  16 ++
 12 files changed, 438 insertions(+), 87 deletions(-)

diff --git a/clang/docs/LanguageExtensions.rst b/clang/docs/LanguageExtensions.rst
index 0c6b9b1b8f9ce4..9eefbfecbce514 100644
--- a/clang/docs/LanguageExtensions.rst
+++ b/clang/docs/LanguageExtensions.rst
@@ -647,66 +647,74 @@ elementwise to the input.
 
 Unless specified otherwise operation(±0) = ±0 and operation(±infinity) = ±infinity
 
-=========================================== ================================================================ =========================================
-         Name                                Operation                                                        Supported element types
-=========================================== ================================================================ =========================================
- T __builtin_elementwise_abs(T x)            return the absolute value of a number x; the absolute value of   signed integer and floating point types
-                                             the most negative integer remains the most negative integer
- T __builtin_elementwise_fma(T x, T y, T z)  fused multiply add, (x * y) +  z.                                floating point types
- T __builtin_elementwise_ceil(T x)           return the smallest integral value greater than or equal to x    floating point types
- T __builtin_elementwise_sin(T x)            return the sine of x interpreted as an angle in radians          floating point types
- T __builtin_elementwise_cos(T x)            return the cosine of x interpreted as an angle in radians        floating point types
- T __builtin_elementwise_tan(T x)            return the tangent of x interpreted as an angle in radians       floating point types
- T __builtin_elementwise_asin(T x)           return the arcsine of x interpreted as an angle in radians       floating point types
- T __builtin_elementwise_acos(T x)           return the arccosine of x interpreted as an angle in radians     floating point types
- T __builtin_elementwise_atan(T x)           return the arctangent of x interpreted as an angle in radians    floating point types
- T __builtin_elementwise_sinh(T x)           return the hyperbolic sine of angle x in radians                 floating point types
- T __builtin_elementwise_cosh(T x)           return the hyperbolic cosine of angle x in radians               floating point types
- T __builtin_elementwise_tanh(T x)           return the hyperbolic tangent of angle x in radians              floating point types
- T __builtin_elementwise_floor(T x)          return the largest integral value less than or equal to x        floating point types
- T __builtin_elementwise_log(T x)            return the natural logarithm of x                                floating point types
- T __builtin_elementwise_log2(T x)           return the base 2 logarithm of x                                 floating point types
- T __builtin_elementwise_log10(T x)          return the base 10 logarithm of x                                floating point types
- T __builtin_elementwise_popcount(T x)       return the number of 1 bits in x                                 integer types 
- T __builtin_elementwise_pow(T x, T y)       return x raised to the power of y                                floating point types
- T __builtin_elementwise_bitreverse(T x)     return the integer represented after reversing the bits of x     integer types
- T __builtin_elementwise_exp(T x)            returns the base-e exponential, e^x, of the specified value      floating point types
- T __builtin_elementwise_exp2(T x)           returns the base-2 exponential, 2^x, of the specified value      floating point types
-
- T __builtin_elementwise_sqrt(T x)           return the square root of a floating-point number                floating point types
- T __builtin_elementwise_roundeven(T x)      round x to the nearest integer value in floating point format,   floating point types
-                                             rounding halfway cases to even (that is, to the nearest value
-                                             that is an even integer), regardless of the current rounding
-                                             direction.
- T __builtin_elementwise_round(T x)          round x to the nearest  integer value in floating point format,      floating point types
-                                             rounding halfway cases away from zero, regardless of the
-                                             current rounding direction. May raise floating-point
-                                             exceptions.
- T __builtin_elementwise_trunc(T x)          return the integral value nearest to but no larger in            floating point types
-                                             magnitude than x
-
-  T __builtin_elementwise_nearbyint(T x)     round x to the nearest  integer value in floating point format,      floating point types
-                                             rounding according to the current rounding direction.
-                                             May not raise the inexact floating-point exception. This is
-                                             treated the same as ``__builtin_elementwise_rint`` unless
-                                             :ref:`FENV_ACCESS is enabled <floating-point-environment>`.
-
- T __builtin_elementwise_rint(T x)           round x to the nearest  integer value in floating point format,      floating point types
-                                             rounding according to the current rounding
-                                             direction. May raise floating-point exceptions. This is treated
-                                             the same as ``__builtin_elementwise_nearbyint`` unless
-                                             :ref:`FENV_ACCESS is enabled <floating-point-environment>`.
-
- T __builtin_elementwise_canonicalize(T x)   return the platform specific canonical encoding                  floating point types
-                                             of a floating-point number
- T __builtin_elementwise_copysign(T x, T y)  return the magnitude of x with the sign of y.                    floating point types
- T __builtin_elementwise_max(T x, T y)       return x or y, whichever is larger                               integer and floating point types
- T __builtin_elementwise_min(T x, T y)       return x or y, whichever is smaller                              integer and floating point types
- T __builtin_elementwise_add_sat(T x, T y)   return the sum of x and y, clamped to the range of               integer types
-                                             representable values for the signed/unsigned integer type.
- T __builtin_elementwise_sub_sat(T x, T y)   return the difference of x and y, clamped to the range of        integer types
-                                             representable values for the signed/unsigned integer type.
-=========================================== ================================================================ =========================================
+============================================== ====================================================================== =========================================
+         Name                                   Operation                                                              Supported element types
+============================================== ====================================================================== =========================================
+ T __builtin_elementwise_abs(T x)               return the absolute value of a number x; the absolute value of         signed integer and floating point types
+                                                the most negative integer remains the most negative integer
+ T __builtin_elementwise_fma(T x, T y, T z)     fused multiply add, (x * y) +  z.                                      floating point types
+ T __builtin_elementwise_ceil(T x)              return the smallest integral value greater than or equal to x          floating point types
+ T __builtin_elementwise_sin(T x)               return the sine of x interpreted as an angle in radians                floating point types
+ T __builtin_elementwise_cos(T x)               return the cosine of x interpreted as an angle in radians              floating point types
+ T __builtin_elementwise_tan(T x)               return the tangent of x interpreted as an angle in radians             floating point types
+ T __builtin_elementwise_asin(T x)              return the arcsine of x interpreted as an angle in radians             floating point types
+ T __builtin_elementwise_acos(T x)              return the arccosine of x interpreted as an angle in radians           floating point types
+ T __builtin_elementwise_atan(T x)              return the arctangent of x interpreted as an angle in radians          floating point types
+ T __builtin_elementwise_sinh(T x)              return the hyperbolic sine of angle x in radians                       floating point types
+ T __builtin_elementwise_cosh(T x)              return the hyperbolic cosine of angle x in radians                     floating point types
+ T __builtin_elementwise_tanh(T x)              return the hyperbolic tangent of angle x in radians                    floating point types
+ T __builtin_elementwise_floor(T x)             return the largest integral value less than or equal to x              floating point types
+ T __builtin_elementwise_log(T x)               return the natural logarithm of x                                      floating point types
+ T __builtin_elementwise_log2(T x)              return the base 2 logarithm of x                                       floating point types
+ T __builtin_elementwise_log10(T x)             return the base 10 logarithm of x                                      floating point types
+ T __builtin_elementwise_popcount(T x)          return the number of 1 bits in x                                       integer types
+ T __builtin_elementwise_pow(T x, T y)          return x raised to the power of y                                      floating point types
+ T __builtin_elementwise_bitreverse(T x)        return the integer represented after reversing the bits of x           integer types
+ T __builtin_elementwise_exp(T x)               returns the base-e exponential, e^x, of the specified value            floating point types
+ T __builtin_elementwise_exp2(T x)              returns the base-2 exponential, 2^x, of the specified value            floating point types
+
+ T __builtin_elementwise_sqrt(T x)              return the square root of a floating-point number                      floating point types
+ T __builtin_elementwise_roundeven(T x)         round x to the nearest integer value in floating point format,         floating point types
+                                                rounding halfway cases to even (that is, to the nearest value
+                                                that is an even integer), regardless of the current rounding
+                                                direction.
+ T __builtin_elementwise_round(T x)             round x to the nearest  integer value in floating point format,        floating point types
+                                                rounding halfway cases away from zero, regardless of the
+                                                current rounding direction. May raise floating-point
+                                                exceptions.
+ T __builtin_elementwise_trunc(T x)             return the integral value nearest to but no larger in                  floating point types
+                                                magnitude than x
+
+  T __builtin_elementwise_nearbyint(T x)        round x to the nearest  integer value in floating point format,        floating point types
+                                                rounding according to the current rounding direction.
+                                                May not raise the inexact floating-point exception. This is
+                                                treated the same as ``__builtin_elementwise_rint`` unless
+                                                :ref:`FENV_ACCESS is enabled <floating-point-environment>`.
+
+ T __builtin_elementwise_rint(T x)              round x to the nearest  integer value in floating point format,        floating point types
+                                                rounding according to the current rounding
+                                                direction. May raise floating-point exceptions. This is treated
+                                                the same as ``__builtin_elementwise_nearbyint`` unless
+                                                :ref:`FENV_ACCESS is enabled <floating-point-environment>`.
+
+ T __builtin_elementwise_canonicalize(T x)      return the platform specific canonical encoding                        floating point types
+                                                of a floating-point number
+ T __builtin_elementwise_copysign(T x, T y)     return the magnitude of x with the sign of y.                          floating point types
+ T __builtin_elementwise_max(T x, T y)          return x or y, whichever is larger                                     integer and floating point types
+ T __builtin_elementwise_min(T x, T y)          return x or y, whichever is smaller                                    integer and floating point types
+ T __builtin_elementwise_add_sat(T x, T y)      return the sum of x and y, clamped to the range of                     integer types
+                                                representable values for the signed/unsigned integer type.
+ T __builtin_elementwise_sub_sat(T x, T y)      return the difference of x and y, clamped to the range of              integer types
+                                                representable values for the signed/unsigned integer type.
+ T __builtin_elementwise_maximum(T x, T y)      return x or y, whichever is larger. Follows IEEE 754-2019              floating point types
+                                                semantics, see `LangRef
+                                                <http://llvm.org/docs/LangRef.html#llvm-min-intrinsics-comparation>`_
+                                                for the comparison.
+ T __builtin_elementwise_minimum(T x, T y)      return x or y, whichever is smaller. Follows IEEE 754-2019             floating point types
+                                                semantics, see `LangRef
+                                                <http://llvm.org/docs/LangRef.html#llvm-min-intrinsics-comparation>`_
+                                                for the comparison.
+============================================== ====================================================================== =========================================
 
 
 *Reduction Builtins*
@@ -731,21 +739,29 @@ Example:
 
 Let ``VT`` be a vector type and ``ET`` the element type of ``VT``.
 
-======================================= ================================================================ ==================================
-         Name                            Operation                                                        Supported element types
-======================================= ================================================================ ==================================
- ET __builtin_reduce_max(VT a)           return x or y, whichever is larger; If exactly one argument is   integer and floating point types
+======================================= ====================================================================== ==================================
+         Name                            Operation                                                              Supported element types
+======================================= ====================================================================== ==================================
+ ET __builtin_reduce_max(VT a)           return x or y, whichever is larger; If exactly one argument is         integer and floating point types
                                          a NaN, return the other argument. If both arguments are NaNs,
                                          fmax() return a NaN.
- ET __builtin_reduce_min(VT a)           return x or y, whichever is smaller; If exactly one argument     integer and floating point types
+ ET __builtin_reduce_min(VT a)           return x or y, whichever is smaller; If exactly one argument           integer and floating point types
                                          is a NaN, return the other argument. If both arguments are
                                          NaNs, fmax() return a NaN.
- ET __builtin_reduce_add(VT a)           \+                                                               integer types
- ET __builtin_reduce_mul(VT a)           \*                                                               integer types
- ET __builtin_reduce_and(VT a)           &                                                                integer types
- ET __builtin_reduce_or(VT a)            \|                                                               integer types
- ET __builtin_reduce_xor(VT a)           ^                                                                integer types
-======================================= ================================================================ ==================================
+ ET __builtin_reduce_add(VT a)           \+                                                                     integer types
+ ET __builtin_reduce_mul(VT a)           \*                                                                     integer types
+ ET __builtin_reduce_and(VT a)           &                                                                      integer types
+ ET __builtin_reduce_or(VT a)            \|                                                                     integer types
+ ET __builtin_reduce_xor(VT a)           ^                                                                      integer types
+ ET __builtin_reduce_maximum(VT a)       return the largest element of the vector. Follows IEEE 754-2019        floating point types
+                                         semantics, see `LangRef
+                                         <http://llvm.org/docs/LangRef.html#llvm-min-intrinsics-comparation>`_
+                                         for the comparison.
+ ET __builtin_reduce_minimum(VT a)       return the smallest element of the vector. Follows IEEE 754-2019       floating point types
+                                         semantics, see `LangRef
+                                         <http://llvm.org/docs/LangRef.html#llvm-min-intrinsics-comparation>`_
+                                         for the comparison.
+======================================= ====================================================================== ==================================
 
 Matrix Types
 ============
diff --git a/clang/include/clang/Basic/Builtins.td b/clang/include/clang/Basic/Builtins.td
index 33791270800c9d..d26f5b9a6c8bdc 100644
--- a/clang/include/clang/Basic/Builtins.td
+++ b/clang/include/clang/Basic/Builtins.td
@@ -1268,6 +1268,18 @@ def ElementwiseMin : Builtin {
   let Prototype = "void(...)";
 }
 
+def ElementwiseMaximum : Builtin {
+  let Spellings = ["__builtin_elementwise_maximum"];
+  let Attributes = [NoThrow, Const, CustomTypeChecking];
+  let Prototype = "void(...)";
+}
+
+def ElementwiseMinimum : Builtin {
+  let Spellings = ["__builtin_elementwise_minimum"];
+  let Attributes = [NoThrow, Const, CustomTypeChecking];
+  let Prototype = "void(...)";
+}
+
 def ElementwiseCeil : Builtin {
   let Spellings = ["__builtin_elementwise_ceil"];
   let Attributes = [NoThrow, Const, CustomTypeChecking];
@@ -1436,6 +1448,18 @@ def ReduceMin : Builtin {
   let Prototype = "void(...)";
 }
 
+def ReduceMaximum : Builtin {
+  let Spellings = ["__builtin_reduce_maximum"];
+  let Attributes = [NoThrow, Const, CustomTypeChecking];
+  let Prototype = "void(...)";
+}
+
+def ReduceMinimum : Builtin {
+  let Spellings = ["__builtin_reduce_minimum"];
+  let Attributes = [NoThrow, Const, CustomTypeChecking];
+  let Prototype = "void(...)";
+}
+
 def ReduceXor : Builtin {
   let Spellings = ["__builtin_reduce_xor"];
   let Attributes = [NoThrow, Const, CustomTypeChecking];
diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index f3d5d4c56606cc..4f598f7517cf23 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -12216,7 +12216,8 @@ def err_builtin_invalid_arg_type: Error <
   "a floating point type|"
   "a vector of integers|"
   "an unsigned integer|"
-  "an 'int'}1 (was %2)">;
+  "an 'int'|"
+  "a vector of floating points}1 (was %2)">;
 
 def err_builtin_matrix_disabled: Error<
   "matrix types extension is disabled. Pass -fenable-matrix to enable it">;
diff --git a/clang/include/clang/Sema/Sema.h b/clang/include/clang/Sema/Sema.h
index e1c3a99cfa167e..2549be38beb2b6 100644
--- a/clang/include/clang/Sema/Sema.h
+++ b/clang/include/clang/Sema/Sema.h
@@ -2381,7 +2381,8 @@ class Sema final : public SemaBase {
   bool CheckFunctionCall(FunctionDecl *FDecl, CallExpr *TheCall,
                          const FunctionProtoType *Proto);
 
-  bool BuiltinVectorMath(CallExpr *TheCall, QualType &Res);
+  /// \param FPOnly restricts the arguments to floating-point types.
+  bool BuiltinVectorMath(CallExpr *TheCall, QualType &Res, bool FPOnly = false);
   bool BuiltinVectorToScalarMath(CallExpr *TheCall);
 
   /// Handles the checks for format strings, non-POD arguments to vararg
@@ -2573,7 +2574,8 @@ class Sema final : public SemaBase {
   ExprResult AtomicOpsOverloaded(ExprResult TheCallResult,
                                  AtomicExpr::AtomicOp Op);
 
-  bool BuiltinElementwiseMath(CallExpr *TheCall);
+  /// \param FPOnly restricts the arguments to floating-point types.
+  bool BuiltinElementwiseMath(CallExpr *TheCall, bool FPOnly = false);
   bool PrepareBuiltinReduceMathOneArgCall(CallExpr *TheCall);
 
   bool BuiltinNonDeterministicValue(CallExpr *TheCall);
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 9033cd1ccd781d..cbc51ebbe97747 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -3960,6 +3960,22 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID,
     return RValue::get(Result);
   }
 
+  case Builtin::BI__builtin_elementwise_maximum: {
+    Value *Op0 = EmitScalarExpr(E->getArg(0));
+    Value *Op1 = EmitScalarExpr(E->getArg(1));
+    Value *Result = Builder.CreateBinaryIntrinsic(llvm::Intrinsic::maximum, Op0,
+                                                  Op1, nullptr, "elt.maximum");
+    return RValue::get(Result);
+  }
+
+  case Builtin::BI__builtin_elementwise_minimum: {
+    Value *Op0 = EmitScalarExpr(E->getArg(0));
+    Value *Op1 = EmitScalarExpr(E->getArg(1));
+    Value *Result = Builder.CreateBinaryIntrinsic(llvm::Intrinsic::minimum, Op0,
+                                                  Op1, nullptr, "elt.minimum");
+    return RValue::get(Result);
+  }
+
   case Builtin::BI__builtin_reduce_max: {
     auto GetIntrinsicID = [this](QualType QT) {
       if (auto *VecTy = QT->getAs<VectorType>())
@@ -4013,6 +4029,29 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID,
     return RValue::get(emitBuiltinWithOneOverloadedType<1>(
         *this, E, llvm::Intrinsic::vector_reduce_and, "rdx.and"));
 
+  case Builtin::BI__builtin_reduce_maximum: {
+    auto GetIntrinsicID = [](QualType QT) {
+      if (auto *VecTy = QT->getAs<VectorType>())
+        QT = VecTy->getElementType();
+      assert(QT->isFloatingType() && "must have a float here");
+      return llvm::Intrinsic::vector_reduce_fmaximum;
+    };
+    return RValue::get(emitBuiltinWithOneOverloadedType<1>(
+        *this, E, GetIntrinsicID(E->getArg(0)->getType()), "rdx.maximum"));
+  }
+
+  case Builtin::BI__builtin_reduce_minimum: {
+    auto GetIntrinsicID = [](QualType QT) {
+      if (auto *VecTy = QT->getAs<VectorType>())
+        QT = VecTy->getElementType();
+      assert(QT->isFloatingType() && "must have a float here");
+      return llvm::Intrinsic::vector_reduce_fminimum;
+    };
+
+    return RValue::get(emitBuiltinWithOneOverloadedType<1>(
+        *this, E, GetIntrinsicID(E->getArg(0)->getType()), "rdx.minimum"));
+  }
+
   case Builtin::BI__builtin_matrix_transpose: {
     auto *MatrixTy = E->getArg(0)->getType()->castAs<ConstantMatrixType>();
     Value *MatValue = EmitScalarExpr(E->getArg(0));
diff --git a/clang/lib/Sema/SemaChecking.cpp b/clang/lib/Sema/SemaChecking.cpp
index af1dc21594da8a..2b033f3f7fb367 100644
--- a/clang/lib/Sema/SemaChecking.cpp
+++ b/clang/lib/Sema/SemaChecking.cpp
@@ -2755,15 +2755,10 @@ Sema::CheckBuiltinFunctionCall(FunctionDecl *FDecl, unsigned BuiltinID,
 
   // These builtins restrict the element type to floating point
   // types only, and take in two arguments.
+  case Builtin::BI__builtin_elementwise_minimum:
+  case Builtin::BI__builtin_elementwise_maximum:
   case Builtin::BI__builtin_elementwise_pow: {
-    if (BuiltinElementwiseMath(TheCall))
-      return ExprError();
-
-    QualType ArgTy = TheCall->getArg(0)->getType();
-    if (checkFPMathBuiltinElementType(*this, TheCall->getArg(0)->getBeginLoc(),
-                                      ArgTy, 1) ||
-        checkFPMathBuiltinElementType(*this, TheCall->getArg(1)->getBeginLoc(),
-                                      ArgTy, 2))
+    if (BuiltinElementwiseMath(TheCall, /*FPOnly=*/true))
       return ExprError();
     break;
   }
@@ -2867,6 +2862,29 @@ Sema::CheckBuiltinFunctionCall(FunctionDecl *FDecl, unsigned BuiltinID,
     TheCall->setType(ElTy);
     break;
   }
+  case Builtin::BI__builtin_reduce_maximum:
+  case Builtin::BI__builtin_reduce_minimum: {
+    if (PrepareBuiltinReduceMathOneArgCall(TheCall))
+      return ExprError();
+
+    const Expr *Arg = TheCall->getArg(0);
+    const auto *TyA = Arg->getType()->getAs<VectorType>();
+
+    QualType ElTy;
+    if (TyA)
+      ElTy = TyA->getElementType();
+    else if (Arg->getType()->isSizelessVectorType())
+      ElTy = Arg->getType()->getSizelessVectorEltType(Context);
+
+    if (ElTy.isNull() || !ElTy->isFloatingType()) {
+      Diag(Arg->getBeginLoc(), diag::err_builtin_invalid_arg_type)
+          << 1 << /* vector of floating points */ 9 << Arg->getType();
+      return ExprError();
+    }
+
+    TheCall->setType(ElTy);
+    break;
+  }
 
   // These builtins support vectors of integers only.
   // TODO: ADD/MUL should support floating-point types.
@@ -14377,9 +14395,9 @@ bool Sema::PrepareBuiltinElementwiseMathOneArgCall(CallExpr *TheCall) {
   return false;
 }
 
-bool Sema::BuiltinElementwiseMath(CallExpr *TheCall) {
+bool Sema::BuiltinElementwiseMath(CallExpr *TheCall, bool FPOnly) {
   QualType Res;
-  if (BuiltinVectorMath(TheCall, Res))
+  if (BuiltinVectorMath(TheCall, Res, FPOnly))
     return true;
   TheCall->setType(Res);
   return false;
@@ -14398,7 +14416,7 @@ bool Sema::BuiltinVectorToScalarMath(CallExpr *TheCall) {
   return false;
 }
 
-bool Sema::BuiltinVectorMath(CallExpr *TheCall, QualType &Res) {
+bool Sema::BuiltinVectorMath(CallExpr *TheCall, QualType &Res, bool FPOnly) {
   if (checkArgCount(TheCall, 2))
     return true;
 
@@ -14418,8 +14436,13 @@ bool Sema::BuiltinVectorMath(CallExpr *TheCall, QualType &Res) {
                 diag::err_typecheck_call_different_arg_types)
            << TyA << TyB;
 
-  if (checkMathBuiltinElementType(*this, A.get()->getBeginLoc(), TyA, 1))
-    return true;
+  if (FPOnly) {
+    if (checkFPMathBuiltinElementType(*this, A.get()->getBeginLoc(), TyA, 1))
+      return true;
+  } else {
+    if (checkMathBuiltinElementType(*this, A.get()->getBeginLoc(), TyA, 1))
+      return true;
+  }
 
   TheCall->setArg(0, A.get());
   TheCall->setArg(1, B.get());
diff --git a/clang/test/CodeGen/builtins-elementwise-math.c b/clang/test/CodeGen/builtins-elementwise-math.c
index 7e094a52653ef0..4a02b2f6467c34 100644
--- a/clang/test/CodeGen/builtins-elementwise-math.c
+++ b/clang/test/CodeGen/builtins-elementwise-math.c
@@ -169,6 +169,82 @@ void test_builtin_elementwise_sub_sat(float f1, float f2, double d1, double d2,
   i1 = __builtin_elementwise_sub_sat(1, 'a');
 }
 
+void test_builtin_elementwise_maximum(float f1, float f2, double d1, double d2,
+                                      float4 vf1, float4 vf2, long long int i1,
+                                      long long int i2, si8 vi1, si8 vi2,
+                                      unsigned u1, unsigned u2, u4 vu1, u4 vu2,
+                                      _BitInt(31) bi1, _BitInt(31) bi2,
+                                      unsigned _BitInt(55) bu1, unsigned _BitInt(55) bu2) {
+  // CHECK-LABEL: define void @test_builtin_elementwise_maximum(
+  // CHECK:      [[F1:%.+]] = load float, ptr %f1.addr, align 4
+  // CHECK-NEXT: [[F2:%.+]] = load float, ptr %f2.addr, align 4
+  // CHECK-NEXT:  call float @llvm.maximum.f32(float [[F1]], float [[F2]])
+  f1 = __builtin_elementwise_maximum(f1, f2);
+
+  // CHECK:      [[D1:%.+]] = load double, ptr %d1.addr, align 8
+  // CHECK-NEXT: [[D2:%.+]] = load double, ptr %d2.addr, align 8
+  // CHECK-NEXT: call double @llvm.maximum.f64(double [[D1]], double [[D2]])
+  d1 = __builtin_elementwise_maximum(d1, d2);
+
+  // CHECK:      [[D2:%.+]] = load double, ptr %d2.addr, align 8
+  // CHECK-NEXT: call double @llvm.maximum.f64(double 2.000000e+01, double [[D2]])
+  d1 = __builtin_elementwise_maximum(20.0, d2);
+
+  // CHECK:      [[VF1:%.+]] = load <4 x float>, ptr %vf1.addr, align 16
+  // CHECK-NEXT: [[VF2:%.+]] = load <4 x float>, ptr %vf2.addr, align 16
+  // CHECK-NEXT: call <4 x float> @llvm.maximum.v4f32(<4 x float> [[VF1]], <4 x float> [[VF2]])
+  vf1 = __builtin_elementwise_maximum(vf1, vf2);
+
+  // CHECK:      [[CVF1:%.+]] = load <4 x float>, ptr %cvf1, align 16
+  // CHECK-NEXT: [[VF2:%.+]] = load <4 x float>, ptr %vf2.addr, align 16
+  // CHECK-NEXT: call <4 x float> @llvm.maximum.v4f32(<4 x float> [[CVF1]], <4 
x float> [[VF2]])
+  const float4 cvf1 = vf1;
+  vf1 = __builtin_elementwise_maximum(cvf1, vf2);
+
+  // CHECK:      [[VF2:%.+]] = load <4 x float>, ptr %vf2.addr, align 16
+  // CHECK-NEXT: [[CVF1:%.+]] = load <4 x float>, ptr %cvf1, align 16
+  // CHECK-NEXT: call <4 x float> @llvm.maximum.v4f32(<4 x float> [[VF2]], <4 
x float> [[CVF1]])
+  vf1 = __builtin_elementwise_maximum(vf2, cvf1);
+}
+
+void test_builtin_elementwise_minimum(float f1, float f2, double d1, double d2,
+                                      float4 vf1, float4 vf2, long long int i1,
+                                      long long int i2, si8 vi1, si8 vi2,
+                                      unsigned u1, unsigned u2, u4 vu1, u4 vu2,
+                                      _BitInt(31) bi1, _BitInt(31) bi2,
+                                      unsigned _BitInt(55) bu1, unsigned _BitInt(55) bu2) {
+  // CHECK-LABEL: define void @test_builtin_elementwise_minimum(
+  // CHECK:      [[F1:%.+]] = load float, ptr %f1.addr, align 4
+  // CHECK-NEXT: [[F2:%.+]] = load float, ptr %f2.addr, align 4
+  // CHECK-NEXT:  call float @llvm.minimum.f32(float [[F1]], float [[F2]])
+  f1 = __builtin_elementwise_minimum(f1, f2);
+
+  // CHECK:      [[D1:%.+]] = load double, ptr %d1.addr, align 8
+  // CHECK-NEXT: [[D2:%.+]] = load double, ptr %d2.addr, align 8
+  // CHECK-NEXT: call double @llvm.minimum.f64(double [[D1]], double [[D2]])
+  d1 = __builtin_elementwise_minimum(d1, d2);
+
+  // CHECK:      [[D1:%.+]] = load double, ptr %d1.addr, align 8
+  // CHECK-NEXT: call double @llvm.minimum.f64(double [[D1]], double 2.000000e+00)
+  d1 = __builtin_elementwise_minimum(d1, 2.0);
+
+  // CHECK:      [[VF1:%.+]] = load <4 x float>, ptr %vf1.addr, align 16
+  // CHECK-NEXT: [[VF2:%.+]] = load <4 x float>, ptr %vf2.addr, align 16
+  // CHECK-NEXT: call <4 x float> @llvm.minimum.v4f32(<4 x float> [[VF1]], <4 x float> [[VF2]])
+  vf1 = __builtin_elementwise_minimum(vf1, vf2);
+
+  // CHECK:      [[CVF1:%.+]] = load <4 x float>, ptr %cvf1, align 16
+  // CHECK-NEXT: [[VF2:%.+]] = load <4 x float>, ptr %vf2.addr, align 16
+  // CHECK-NEXT: call <4 x float> @llvm.minimum.v4f32(<4 x float> [[CVF1]], <4 x float> [[VF2]])
+  const float4 cvf1 = vf1;
+  vf1 = __builtin_elementwise_minimum(cvf1, vf2);
+
+  // CHECK:      [[VF2:%.+]] = load <4 x float>, ptr %vf2.addr, align 16
+  // CHECK-NEXT: [[CVF1:%.+]] = load <4 x float>, ptr %cvf1, align 16
+  // CHECK-NEXT: call <4 x float> @llvm.minimum.v4f32(<4 x float> [[VF2]], <4 x float> [[CVF1]])
+  vf1 = __builtin_elementwise_minimum(vf2, cvf1);
+}
+
 void test_builtin_elementwise_max(float f1, float f2, double d1, double d2,
                                   float4 vf1, float4 vf2, long long int i1,
                                   long long int i2, si8 vi1, si8 vi2,
diff --git a/clang/test/CodeGen/builtins-reduction-math.c b/clang/test/CodeGen/builtins-reduction-math.c
index acafe9222d59fd..e12fd729c84c0b 100644
--- a/clang/test/CodeGen/builtins-reduction-math.c
+++ b/clang/test/CodeGen/builtins-reduction-math.c
@@ -138,6 +138,30 @@ void test_builtin_reduce_and(si8 vi1, u4 vu1) {
   unsigned r3 = __builtin_reduce_and(vu1);
 }
 
+void test_builtin_reduce_maximum(float4 vf1) {
+  // CHECK-LABEL: define void @test_builtin_reduce_maximum(
+  // CHECK:      [[VF1:%.+]] = load <4 x float>, ptr %vf1.addr, align 16
+  // CHECK-NEXT: call float @llvm.vector.reduce.fmaximum.v4f32(<4 x float> [[VF1]])
+  float r1 = __builtin_reduce_maximum(vf1);
+
+  // CHECK:      [[VF1_AS1:%.+]] = load <4 x float>, ptr addrspace(1) @vf1_as_one, align 16
+  // CHECK-NEXT: [[RDX1:%.+]] = call float @llvm.vector.reduce.fmaximum.v4f32(<4 x float> [[VF1_AS1]])
+  // CHECK-NEXT: fpext float [[RDX1]] to double
+  const double r4 = __builtin_reduce_maximum(vf1_as_one);
+}
+
+void test_builtin_reduce_minimum(float4 vf1) {
+  // CHECK-LABEL: define void @test_builtin_reduce_minimum(
+  // CHECK:      [[VF1:%.+]] = load <4 x float>, ptr %vf1.addr, align 16
+  // CHECK-NEXT: call float @llvm.vector.reduce.fminimum.v4f32(<4 x float> [[VF1]])
+  float r1 = __builtin_reduce_minimum(vf1);
+
+  // CHECK:      [[VF1_AS1:%.+]] = load <4 x float>, ptr addrspace(1) @vf1_as_one, align 16
+  // CHECK-NEXT: [[RDX1:%.+]] = call float @llvm.vector.reduce.fminimum.v4f32(<4 x float> [[VF1_AS1]])
+  // CHECK-NEXT: fpext float [[RDX1]] to double
+  const double r4 = __builtin_reduce_minimum(vf1_as_one);
+}
+
 #if defined(__ARM_FEATURE_SVE)
 #include <arm_sve.h>
 
diff --git a/clang/test/CodeGen/strictfp-elementwise-bulitins.cpp b/clang/test/CodeGen/strictfp-elementwise-bulitins.cpp
index 55ba17a1955800..dc5674ddab233c 100644
--- a/clang/test/CodeGen/strictfp-elementwise-bulitins.cpp
+++ b/clang/test/CodeGen/strictfp-elementwise-bulitins.cpp
@@ -47,6 +47,26 @@ float4 strict_elementwise_min(float4 a, float4 b) {
   return __builtin_elementwise_min(a, b);
 }
 
+// CHECK-LABEL: define dso_local noundef <4 x float> @_Z26strict_elementwise_maximumDv4_fS_
+// CHECK-SAME: (<4 x float> noundef [[A:%.*]], <4 x float> noundef [[B:%.*]]) local_unnamed_addr #[[ATTR2]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[ELT_MAXIMUM:%.*]] = tail call <4 x float> @llvm.maximum.v4f32(<4 x float> [[A]], <4 x float> [[B]]) #[[ATTR4]]
+// CHECK-NEXT:    ret <4 x float> [[ELT_MAXIMUM]]
+//
+float4 strict_elementwise_maximum(float4 a, float4 b) {
+  return __builtin_elementwise_maximum(a, b);
+}
+
+// CHECK-LABEL: define dso_local noundef <4 x float> @_Z26strict_elementwise_minimumDv4_fS_
+// CHECK-SAME: (<4 x float> noundef [[A:%.*]], <4 x float> noundef [[B:%.*]]) local_unnamed_addr #[[ATTR2]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[ELT_MINIMUM:%.*]] = tail call <4 x float> @llvm.minimum.v4f32(<4 x float> [[A]], <4 x float> [[B]]) #[[ATTR4]]
+// CHECK-NEXT:    ret <4 x float> [[ELT_MINIMUM]]
+//
+float4 strict_elementwise_minimum(float4 a, float4 b) {
+  return __builtin_elementwise_minimum(a, b);
+}
+
 // CHECK-LABEL: define dso_local noundef <4 x float> @_Z23strict_elementwise_ceilDv4_f
 // CHECK-SAME: (<4 x float> noundef [[A:%.*]]) local_unnamed_addr #[[ATTR2]] {
 // CHECK-NEXT:  entry:
diff --git a/clang/test/Sema/builtins-elementwise-math.c b/clang/test/Sema/builtins-elementwise-math.c
index 1727be1d6286d5..6eef5874391916 100644
--- a/clang/test/Sema/builtins-elementwise-math.c
+++ b/clang/test/Sema/builtins-elementwise-math.c
@@ -273,6 +273,88 @@ void test_builtin_elementwise_min(int i, short s, double d, float4 v, int3 iv, u
   // expected-error@-1 {{1st argument must be a vector, integer or floating point type (was '_Complex float')}}
 }
 
+void test_builtin_elementwise_maximum(int i, short s, float f, double d, float4 v, int3 iv, unsigned3 uv, int *p) {
+  i = __builtin_elementwise_maximum(p, d);
+  // expected-error@-1 {{arguments are of different types ('int *' vs 'double')}}
+
+  struct Foo foo = __builtin_elementwise_maximum(d, d);
+  // expected-error@-1 {{initializing 'struct Foo' with an expression of incompatible type 'double'}}
+
+  i = __builtin_elementwise_maximum(i);
+  // expected-error@-1 {{too few arguments to function call, expected 2, have 1}}
+
+  i = __builtin_elementwise_maximum();
+  // expected-error@-1 {{too few arguments to function call, expected 2, have 0}}
+
+  i = __builtin_elementwise_maximum(i, i, i);
+  // expected-error@-1 {{too many arguments to function call, expected 2, have 3}}
+
+  i = __builtin_elementwise_maximum(v, iv);
+  // expected-error@-1 {{arguments are of different types ('float4' (vector of 4 'float' values) vs 'int3' (vector of 3 'int' values))}}
+
+  i = __builtin_elementwise_maximum(uv, iv);
+  // expected-error@-1 {{arguments are of different types ('unsigned3' (vector of 3 'unsigned int' values) vs 'int3' (vector of 3 'int' values))}}
+
+  d = __builtin_elementwise_maximum(f, d);
+
+  v = __builtin_elementwise_maximum(v, v);
+
+  i = __builtin_elementwise_maximum(iv, iv);
+  // expected-error@-1 {{1st argument must be a floating point type (was 'int3' (vector of 3 'int' values))}}
+
+  i = __builtin_elementwise_maximum(i, i);
+  // expected-error@-1 {{1st argument must be a floating point type (was 'int')}}
+
+  int A[10];
+  A = __builtin_elementwise_maximum(A, A);
+  // expected-error@-1 {{1st argument must be a floating point type (was 'int *')}}
+
+  _Complex float c1, c2;
+  c1 = __builtin_elementwise_maximum(c1, c2);
+  // expected-error@-1 {{1st argument must be a floating point type (was '_Complex float')}}
+}
+
+void test_builtin_elementwise_minimum(int i, short s, float f, double d, float4 v, int3 iv, unsigned3 uv, int *p) {
+  i = __builtin_elementwise_minimum(p, d);
+  // expected-error@-1 {{arguments are of different types ('int *' vs 'double')}}
+
+  struct Foo foo = __builtin_elementwise_minimum(d, d);
+  // expected-error@-1 {{initializing 'struct Foo' with an expression of incompatible type 'double'}}
+
+  i = __builtin_elementwise_minimum(i);
+  // expected-error@-1 {{too few arguments to function call, expected 2, have 1}}
+
+  i = __builtin_elementwise_minimum();
+  // expected-error@-1 {{too few arguments to function call, expected 2, have 0}}
+
+  i = __builtin_elementwise_minimum(i, i, i);
+  // expected-error@-1 {{too many arguments to function call, expected 2, have 3}}
+
+  i = __builtin_elementwise_minimum(v, iv);
+  // expected-error@-1 {{arguments are of different types ('float4' (vector of 4 'float' values) vs 'int3' (vector of 3 'int' values))}}
+
+  i = __builtin_elementwise_minimum(uv, iv);
+  // expected-error@-1 {{arguments are of different types ('unsigned3' (vector of 3 'unsigned int' values) vs 'int3' (vector of 3 'int' values))}}
+
+  d = __builtin_elementwise_minimum(f, d);
+
+  v = __builtin_elementwise_minimum(v, v);
+
+  i = __builtin_elementwise_minimum(iv, iv);
+  // expected-error@-1 {{1st argument must be a floating point type (was 'int3' (vector of 3 'int' values))}}
+
+  i = __builtin_elementwise_minimum(i, i);
+  // expected-error@-1 {{1st argument must be a floating point type (was 'int')}}
+
+  int A[10];
+  A = __builtin_elementwise_minimum(A, A);
+  // expected-error@-1 {{1st argument must be a floating point type (was 'int *')}}
+
+  _Complex float c1, c2;
+  c1 = __builtin_elementwise_minimum(c1, c2);
+  // expected-error@-1 {{1st argument must be a floating point type (was '_Complex float')}}
+}
+
 void test_builtin_elementwise_bitreverse(int i, float f, double d, float4 v, int3 iv, unsigned u, unsigned4 uv) {
 
   struct Foo s = __builtin_elementwise_bitreverse(i);
diff --git a/clang/test/Sema/builtins-reduction-math.c b/clang/test/Sema/builtins-reduction-math.c
index 9d5eed75eb8141..9b0d91bfd6e3d2 100644
--- a/clang/test/Sema/builtins-reduction-math.c
+++ b/clang/test/Sema/builtins-reduction-math.c
@@ -120,3 +120,31 @@ void test_builtin_reduce_and(int i, float4 v, int3 iv) {
   i = __builtin_reduce_and(v);
   // expected-error@-1 {{1st argument must be a vector of integers (was 'float4' (vector of 4 'float' values))}}
 }
+
+void test_builtin_reduce_maximum(int i, float4 v, int3 iv) {
+  struct Foo s = __builtin_reduce_maximum(v);
+  // expected-error@-1 {{initializing 'struct Foo' with an expression of incompatible type 'float'}}
+
+  i = __builtin_reduce_maximum(v, v);
+  // expected-error@-1 {{too many arguments to function call, expected 1, have 2}}
+
+  i = __builtin_reduce_maximum();
+  // expected-error@-1 {{too few arguments to function call, expected 1, have 0}}
+
+  i = __builtin_reduce_maximum(i);
+  // expected-error@-1 {{1st argument must be a vector of floating points (was 'int')}}
+}
+
+void test_builtin_reduce_minimum(int i, float4 v, int3 iv) {
+  struct Foo s = __builtin_reduce_minimum(v);
+  // expected-error@-1 {{initializing 'struct Foo' with an expression of incompatible type 'float'}}
+
+  i = __builtin_reduce_minimum(v, v);
+  // expected-error@-1 {{too many arguments to function call, expected 1, have 2}}
+
+  i = __builtin_reduce_minimum();
+  // expected-error@-1 {{too few arguments to function call, expected 1, have 0}}
+
+  i = __builtin_reduce_minimum(i);
+  // expected-error@-1 {{1st argument must be a vector of floating points (was 'int')}}
+}
diff --git a/clang/test/SemaCXX/builtins-elementwise-math.cpp b/clang/test/SemaCXX/builtins-elementwise-math.cpp
index c3d8bc593c0bbc..c83ef3bedb0e81 100644
--- a/clang/test/SemaCXX/builtins-elementwise-math.cpp
+++ b/clang/test/SemaCXX/builtins-elementwise-math.cpp
@@ -76,6 +76,22 @@ void test_builtin_elementwise_min_fp() {
   static_assert(!is_const<decltype(__builtin_elementwise_min(a, a))>::value);
 }
 
+void test_builtin_elementwise_maximum() {
+  const float a = 2.0f;
+  float b = 1.0f;
+  static_assert(!is_const<decltype(__builtin_elementwise_maximum(a, b))>::value);
+  static_assert(!is_const<decltype(__builtin_elementwise_maximum(b, a))>::value);
+  static_assert(!is_const<decltype(__builtin_elementwise_maximum(a, a))>::value);
+}
+
+void test_builtin_elementwise_minimum() {
+  const float a = 2.0f;
+  float b = 1.0f;
+  static_assert(!is_const<decltype(__builtin_elementwise_minimum(a, b))>::value);
+  static_assert(!is_const<decltype(__builtin_elementwise_minimum(b, a))>::value);
+  static_assert(!is_const<decltype(__builtin_elementwise_minimum(a, a))>::value);
+}
+
 void test_builtin_elementwise_ceil() {
   const float a = 42.0;
   float b = 42.3;
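
For readers skimming the tests above, here is a minimal, portable C sketch of the NaN and signed-zero behavior the new builtins are expected to expose. It assumes they follow llvm.maximum / llvm.minimum as described in the LLVM LangRef (a NaN in either operand yields NaN, and -0.0 orders below +0.0). The model_* helpers and check_nan_semantics are hypothetical names used only for illustration; they are not part of the patch and do not call the builtins, so the sketch builds with any C99 compiler. C's fmaxf is included for contrast, since it returns the non-NaN operand.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical model of llvm.maximum on one pair of floats (assumption:
   NaN propagates, and -0.0 is treated as less than +0.0). */
static float model_maximumf(float a, float b) {
  if (isnan(a) || isnan(b))
    return NAN;
  if (a == 0.0f && b == 0.0f)      /* two zeros: prefer +0.0 if present */
    return signbit(a) ? b : a;
  return a > b ? a : b;
}

/* Hypothetical model of llvm.minimum, mirroring the function above. */
static float model_minimumf(float a, float b) {
  if (isnan(a) || isnan(b))
    return NAN;
  if (a == 0.0f && b == 0.0f)      /* two zeros: prefer -0.0 if present */
    return signbit(a) ? a : b;
  return a < b ? a : b;
}

/* Hypothetical model of a 4-lane maximum reduction: fold the lanes with the
   pairwise model, so a single NaN lane makes the whole result NaN. */
static float model_reduce_maximum4(const float v[4]) {
  float acc = v[0];
  for (int i = 1; i < 4; ++i)
    acc = model_maximumf(acc, v[i]);
  return acc;
}

/* Contrast the NaN-propagating model with C's fmaxf. */
static void check_nan_semantics(void) {
  assert(isnan(model_maximumf(NAN, 1.0f)));        /* NaN propagates      */
  assert(fmaxf(NAN, 1.0f) == 1.0f);                /* fmaxf drops the NaN */
  assert(!signbit(model_maximumf(-0.0f, 0.0f)));   /* maximum picks +0.0  */
  assert(signbit(model_minimumf(-0.0f, 0.0f)));    /* minimum picks -0.0  */
}
```

Under the assumption stated above, model_maximumf(a, b) should agree with __builtin_elementwise_maximum(a, b) for scalar floats when built with a Clang that includes this patch, and model_reduce_maximum4 should agree with __builtin_reduce_maximum on a float4.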

_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
