[clang] [HLSL][Docs] Add documentation for HLSL functions (PR #75397)

2023-12-13 Thread Tex Riddell via cfe-commits


@@ -0,0 +1,316 @@
+===
+HLSL Function Calls
+===
+
+.. contents::
+   :local:
+
+Introduction
+
+
+This document describes the design and implementation of HLSL's function call
+semantics in Clang. This includes details related to argument conversion and
+parameter lifetimes.
+
+This document does not seek to serve as official documentation for HLSL's
+call semantics, but does provide an overview to assist a reader. The
+authoritative documentation for HLSL's language semantics is the `draft 
language
+specification `_.
+
+Argument Semantics
+==
+
+In HLSL, all function arguments are passed by value in and out of functions.
+HLSL has 3 keywords which denote the parameter semantics (``in``, ``out`` and
+``inout``). In a function declaration a parameter may be annotated any of the
+following ways:
+
+#.  - denotes input
+#. ``in`` - denotes input
+#. ``out`` - denotes output
+#. ``in out`` - denotes input and output
+#. ``out in`` - denotes input and output
+#. ``inout`` - denotes input and output
+
+Parameters that are exclusively input behave like C/C++ parameters that are
+passed by value.
+
+For parameters that are output (or input and output), a temporary value is
+created in the caller. The temporary value is then passed by-address. For
+output-only parameters, the temporary is uninitialized when passed (it is
+undefined behavior to not explicitly initialize an ``out`` parameter inside a
+function). For input and output parameters, the temporary is initialized from

tex3d wrote:

> (it is undefined behavior to not explicitly initialize an ``out`` parameter 
> inside a function)

Perhaps it's better to say that an out parameter has an undefined value if not 
initialized inside the function, rather than this being considered undefined 
**_behavior_**?

https://github.com/llvm/llvm-project/pull/75397
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [HLSL][Docs] Add documentation for HLSL functions (PR #75397)

2023-12-13 Thread Tex Riddell via cfe-commits


@@ -0,0 +1,316 @@
+===
+HLSL Function Calls
+===
+
+.. contents::
+   :local:
+
+Introduction
+
+
+This document describes the design and implementation of HLSL's function call
+semantics in Clang. This includes details related to argument conversion and
+parameter lifetimes.
+
+This document does not seek to serve as official documentation for HLSL's
+call semantics, but does provide an overview to assist a reader. The
+authoritative documentation for HLSL's language semantics is the `draft 
language
+specification `_.
+
+Argument Semantics
+==
+
+In HLSL, all function arguments are passed by value in and out of functions.
+HLSL has 3 keywords which denote the parameter semantics (``in``, ``out`` and
+``inout``). In a function declaration a parameter may be annotated any of the
+following ways:
+
+#.  - denotes input
+#. ``in`` - denotes input
+#. ``out`` - denotes output
+#. ``in out`` - denotes input and output
+#. ``out in`` - denotes input and output
+#. ``inout`` - denotes input and output
+
+Parameters that are exclusively input behave like C/C++ parameters that are
+passed by value.
+
+For parameters that are output (or input and output), a temporary value is
+created in the caller. The temporary value is then passed by-address. For
+output-only parameters, the temporary is uninitialized when passed (it is
+undefined behavior to not explicitly initialize an ``out`` parameter inside a
+function). For input and output parameters, the temporary is initialized from
+the lvalue argument expression through implicit or explicit casting from the
+lvalue argument type to the parameter type.
+
+On return of the function, the values of any parameter temporaries are written
+back to the argument expression through an inverted conversion sequence (if an
+``out`` parameter was not initialized in the function, the uninitialized value
+may be written back).
+
+Parameters of constant-sized array type, are also passed with value semantics.
+This requires input parameters of arrays to construct temporaries and the
+temporaries go through array-to-pointer decay when initializing parameters.
+
+Implementations are allowed to avoid unnecessary temporaries, and HLSL's strict
+no-alias rules can enable some trivial optimizations.
+
+Array Temporaries
+-
+
+Given the following example:
+
+.. code-block:: c++
+
+  void fn(float a[4]) {
+a[0] = a[1] + a[2] + a[3];
+  }
+
+  float4 main() : SV_Target {
+float arr[4] = {1, 1, 1, 1};
+fn(arr);
+return float4(a[0], a[1], a[2], a[3]);

tex3d wrote:

`return float4(a[0], a[1], a[2], a[3]);` references the wrong array.  It should 
be `arr[0], ...`, right?

https://github.com/llvm/llvm-project/pull/75397
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [HLSL][Docs] Add documentation for HLSL functions (PR #75397)

2023-12-13 Thread Tex Riddell via cfe-commits


@@ -0,0 +1,316 @@
+===
+HLSL Function Calls
+===
+
+.. contents::
+   :local:
+
+Introduction
+
+
+This document describes the design and implementation of HLSL's function call
+semantics in Clang. This includes details related to argument conversion and
+parameter lifetimes.
+
+This document does not seek to serve as official documentation for HLSL's
+call semantics, but does provide an overview to assist a reader. The
+authoritative documentation for HLSL's language semantics is the `draft 
language
+specification `_.
+
+Argument Semantics
+==
+
+In HLSL, all function arguments are passed by value in and out of functions.
+HLSL has 3 keywords which denote the parameter semantics (``in``, ``out`` and
+``inout``). In a function declaration a parameter may be annotated any of the
+following ways:
+
+#.  - denotes input
+#. ``in`` - denotes input
+#. ``out`` - denotes output
+#. ``in out`` - denotes input and output
+#. ``out in`` - denotes input and output
+#. ``inout`` - denotes input and output
+
+Parameters that are exclusively input behave like C/C++ parameters that are
+passed by value.
+
+For parameters that are output (or input and output), a temporary value is
+created in the caller. The temporary value is then passed by-address. For
+output-only parameters, the temporary is uninitialized when passed (it is
+undefined behavior to not explicitly initialize an ``out`` parameter inside a
+function). For input and output parameters, the temporary is initialized from
+the lvalue argument expression through implicit or explicit casting from the
+lvalue argument type to the parameter type.
+
+On return of the function, the values of any parameter temporaries are written
+back to the argument expression through an inverted conversion sequence (if an
+``out`` parameter was not initialized in the function, the uninitialized value
+may be written back).
+
+Parameters of constant-sized array type, are also passed with value semantics.
+This requires input parameters of arrays to construct temporaries and the
+temporaries go through array-to-pointer decay when initializing parameters.
+
+Implementations are allowed to avoid unnecessary temporaries, and HLSL's strict
+no-alias rules can enable some trivial optimizations.
+
+Array Temporaries
+-
+
+Given the following example:
+
+.. code-block:: c++
+
+  void fn(float a[4]) {
+a[0] = a[1] + a[2] + a[3];
+  }
+
+  float4 main() : SV_Target {
+float arr[4] = {1, 1, 1, 1};
+fn(arr);
+return float4(a[0], a[1], a[2], a[3]);
+  }
+
+In C or C++, the array parameter decays to a pointer, so after the call to
+``fn``, the value of ``a[0]`` is ``3``. In HLSL, the array is passed by value,

tex3d wrote:

Shouldn't this say: "the value of `arr[0]` is `3`"?  (wrong array again)

https://github.com/llvm/llvm-project/pull/75397
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [HLSL][Docs] Add documentation for HLSL functions (PR #75397)

2023-12-13 Thread Tex Riddell via cfe-commits


@@ -0,0 +1,316 @@
+===
+HLSL Function Calls
+===
+
+.. contents::
+   :local:
+
+Introduction
+
+
+This document describes the design and implementation of HLSL's function call
+semantics in Clang. This includes details related to argument conversion and
+parameter lifetimes.
+
+This document does not seek to serve as official documentation for HLSL's
+call semantics, but does provide an overview to assist a reader. The
+authoritative documentation for HLSL's language semantics is the `draft 
language
+specification `_.
+
+Argument Semantics
+==
+
+In HLSL, all function arguments are passed by value in and out of functions.
+HLSL has 3 keywords which denote the parameter semantics (``in``, ``out`` and
+``inout``). In a function declaration a parameter may be annotated any of the
+following ways:
+
+#.  - denotes input
+#. ``in`` - denotes input
+#. ``out`` - denotes output
+#. ``in out`` - denotes input and output
+#. ``out in`` - denotes input and output
+#. ``inout`` - denotes input and output
+
+Parameters that are exclusively input behave like C/C++ parameters that are
+passed by value.
+
+For parameters that are output (or input and output), a temporary value is
+created in the caller. The temporary value is then passed by-address. For
+output-only parameters, the temporary is uninitialized when passed (it is
+undefined behavior to not explicitly initialize an ``out`` parameter inside a
+function). For input and output parameters, the temporary is initialized from
+the lvalue argument expression through implicit or explicit casting from the
+lvalue argument type to the parameter type.
+
+On return of the function, the values of any parameter temporaries are written
+back to the argument expression through an inverted conversion sequence (if an
+``out`` parameter was not initialized in the function, the uninitialized value
+may be written back).
+
+Parameters of constant-sized array type, are also passed with value semantics.
+This requires input parameters of arrays to construct temporaries and the
+temporaries go through array-to-pointer decay when initializing parameters.
+
+Implementations are allowed to avoid unnecessary temporaries, and HLSL's strict
+no-alias rules can enable some trivial optimizations.
+
+Array Temporaries
+-
+
+Given the following example:
+
+.. code-block:: c++
+
+  void fn(float a[4]) {
+a[0] = a[1] + a[2] + a[3];
+  }
+
+  float4 main() : SV_Target {
+float arr[4] = {1, 1, 1, 1};
+fn(arr);
+return float4(a[0], a[1], a[2], a[3]);
+  }
+
+In C or C++, the array parameter decays to a pointer, so after the call to
+``fn``, the value of ``a[0]`` is ``3``. In HLSL, the array is passed by value,
+so modifications inside ``fn`` do not propagate out.
+
+.. note::
+
+  DXC supports unsized arrays passed directly as decayed pointers, which is an
+  unfortunate behavior divergence.

tex3d wrote:

Well, not reliably.  This area is weird and buggy, in both DXC and FXC.  This 
was originally supported in DXC because it appeared that FXC supported it.  
However, support of unsized array parameters in FXC was limited to resource 
arrays (and only on SM 5.1).  These can also be buggy in FXC, with differences 
in behavior between SM 5.0 and SM 5.1.

In DXC, use of unsized arrays can lead to asserts or crashes.  The primary area 
where they are likely to be useful is when passing an unbounded resource array 
to a function, where you do not want a copy of the resource handles, but want 
the original resource range declared globally to be passed to a function and 
indexed there.  This scenario works for SM 5.1 in FXC, but appears to assert in 
DXC.

For these reasons, we should consider removing support for them in DXC.

https://github.com/llvm/llvm-project/pull/75397
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [HLSL][Docs] Add documentation for HLSL functions (PR #75397)

2023-12-13 Thread Tex Riddell via cfe-commits


@@ -0,0 +1,316 @@
+===
+HLSL Function Calls
+===
+
+.. contents::
+   :local:
+
+Introduction
+
+
+This document describes the design and implementation of HLSL's function call
+semantics in Clang. This includes details related to argument conversion and
+parameter lifetimes.
+
+This document does not seek to serve as official documentation for HLSL's
+call semantics, but does provide an overview to assist a reader. The
+authoritative documentation for HLSL's language semantics is the `draft 
language
+specification `_.
+
+Argument Semantics
+==
+
+In HLSL, all function arguments are passed by value in and out of functions.
+HLSL has 3 keywords which denote the parameter semantics (``in``, ``out`` and
+``inout``). In a function declaration a parameter may be annotated any of the
+following ways:
+
+#.  - denotes input
+#. ``in`` - denotes input
+#. ``out`` - denotes output
+#. ``in out`` - denotes input and output
+#. ``out in`` - denotes input and output
+#. ``inout`` - denotes input and output
+
+Parameters that are exclusively input behave like C/C++ parameters that are
+passed by value.
+
+For parameters that are output (or input and output), a temporary value is
+created in the caller. The temporary value is then passed by-address. For
+output-only parameters, the temporary is uninitialized when passed (if the
+parameter is not explicitly initialized inside the function an undefined value
+is stored back to the argument expression). For input and output parameters, 
the
+temporary is initialized from  the lvalue argument expression through implicit
+or explicit casting from the lvalue argument type to the parameter type.
+
+On return of the function, the values of any parameter temporaries are written
+back to the argument expression through an inverted conversion sequence (if an
+``out`` parameter was not initialized in the function, the uninitialized value
+may be written back).
+
+Parameters of constant-sized array type, are also passed with value semantics.
+This requires input parameters of arrays to construct temporaries and the
+temporaries go through array-to-pointer decay when initializing parameters.
+
+Implementations are allowed to avoid unnecessary temporaries, and HLSL's strict
+no-alias rules can enable some trivial optimizations.
+
+Array Temporaries
+-
+
+Given the following example:
+
+.. code-block:: c++
+
+  void fn(float a[4]) {
+a[0] = a[1] + a[2] + a[3];
+  }
+
+  float4 main() : SV_Target {
+float arr[4] = {1, 1, 1, 1};
+fn(arr);
+return float4(arr[0], arr[1], arr[2], arr[3]);
+  }
+
+In C or C++, the array parameter decays to a pointer, so after the call to
+``fn``, the value of ``arr[0]`` is ``3``. In HLSL, the array is passed by 
value,
+so modifications inside ``fn`` do not propagate out.
+
+.. note::
+
+  DXC supports unsized arrays passed directly as decayed pointers, which is an
+  unfortunate behavior divergence.
+
+Out Parameter Temporaries
+-
+
+.. code-block:: c++
+
+  void Init(inout int X, inout int Y) {
+Y = 2;
+X = 1;
+  }
+
+  void main() {
+int V;
+Init(V, V); // MSVC ABI V == 2, Itanium V == 1
+  }
+
+In the above example the ``Init`` function's behavior depends on the C++ ABI
+implementation. In the MSVC C++ ABI (used for the HLSL DXIL target), call
+arguments are emitted right-to-left and destroyed left-to-right. This means 
that
+the parameter initialization and destruction occurs in the order: {``Y``,
+``X``, ``~X``, ``~Y``}. This causes the write-back of the value of ``Y`` to 
occur
+last, so the resulting value of ``V`` is ``2``. In the Itanium C++ ABI, the
+parameter ordering is reversed, so the initialization and destruction occurs in
+the order: {``X``, ``Y``, ``~Y``, ``X``}. This causes the write-back of the
+value ``X`` to occur last, resulting in the value of ``V`` being set to ``1``.
+
+.. code-block:: c++
+
+  void Trunc(inout int3 V) { }
+
+
+  void main() {
+float3 F = {1.5, 2.6, 3.3};
+Trunc(F); // F == {1.0, 2.0, 3.0}
+  }
+
+In the above example, the argument expression ``F`` undergoes element-wise
+conversion from a float vector to an integer vector to create a temporary
+``int3``. On expiration the temporary undergoes elementwise conversion back to
+the floating point vector type ``float3``. This results in an implicit
+truncation of the vector even if the value is unused in the function.

tex3d wrote:

This results in an implicit float to int conversion for each component of the 
vector, not a vector truncation (which would be a reduction in the size of the 
vector).

https://github.com/llvm/llvm-project/pull/75397
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [HLSL][Docs] Add documentation for HLSL functions (PR #75397)

2023-12-13 Thread Tex Riddell via cfe-commits


@@ -0,0 +1,316 @@
+===
+HLSL Function Calls
+===
+
+.. contents::
+   :local:
+
+Introduction
+
+
+This document describes the design and implementation of HLSL's function call
+semantics in Clang. This includes details related to argument conversion and
+parameter lifetimes.
+
+This document does not seek to serve as official documentation for HLSL's
+call semantics, but does provide an overview to assist a reader. The
+authoritative documentation for HLSL's language semantics is the `draft 
language
+specification `_.
+
+Argument Semantics
+==
+
+In HLSL, all function arguments are passed by value in and out of functions.
+HLSL has 3 keywords which denote the parameter semantics (``in``, ``out`` and
+``inout``). In a function declaration a parameter may be annotated any of the
+following ways:
+
+#.  - denotes input
+#. ``in`` - denotes input
+#. ``out`` - denotes output
+#. ``in out`` - denotes input and output
+#. ``out in`` - denotes input and output
+#. ``inout`` - denotes input and output
+
+Parameters that are exclusively input behave like C/C++ parameters that are
+passed by value.
+
+For parameters that are output (or input and output), a temporary value is
+created in the caller. The temporary value is then passed by-address. For
+output-only parameters, the temporary is uninitialized when passed (if the
+parameter is not explicitly initialized inside the function an undefined value
+is stored back to the argument expression). For input and output parameters, 
the
+temporary is initialized from  the lvalue argument expression through implicit
+or explicit casting from the lvalue argument type to the parameter type.
+
+On return of the function, the values of any parameter temporaries are written
+back to the argument expression through an inverted conversion sequence (if an
+``out`` parameter was not initialized in the function, the uninitialized value
+may be written back).
+
+Parameters of constant-sized array type, are also passed with value semantics.
+This requires input parameters of arrays to construct temporaries and the
+temporaries go through array-to-pointer decay when initializing parameters.
+
+Implementations are allowed to avoid unnecessary temporaries, and HLSL's strict
+no-alias rules can enable some trivial optimizations.
+
+Array Temporaries
+-
+
+Given the following example:
+
+.. code-block:: c++
+
+  void fn(float a[4]) {
+a[0] = a[1] + a[2] + a[3];
+  }
+
+  float4 main() : SV_Target {
+float arr[4] = {1, 1, 1, 1};
+fn(arr);
+return float4(arr[0], arr[1], arr[2], arr[3]);
+  }
+
+In C or C++, the array parameter decays to a pointer, so after the call to
+``fn``, the value of ``arr[0]`` is ``3``. In HLSL, the array is passed by 
value,
+so modifications inside ``fn`` do not propagate out.
+
+.. note::
+
+  DXC supports unsized arrays passed directly as decayed pointers, which is an
+  unfortunate behavior divergence.
+
+Out Parameter Temporaries
+-
+
+.. code-block:: c++
+
+  void Init(inout int X, inout int Y) {
+Y = 2;
+X = 1;
+  }
+
+  void main() {
+int V;
+Init(V, V); // MSVC ABI V == 2, Itanium V == 1
+  }
+
+In the above example the ``Init`` function's behavior depends on the C++ ABI
+implementation. In the MSVC C++ ABI (used for the HLSL DXIL target), call
+arguments are emitted right-to-left and destroyed left-to-right. This means 
that
+the parameter initialization and destruction occurs in the order: {``Y``,
+``X``, ``~X``, ``~Y``}. This causes the write-back of the value of ``Y`` to 
occur
+last, so the resulting value of ``V`` is ``2``. In the Itanium C++ ABI, the
+parameter ordering is reversed, so the initialization and destruction occurs in
+the order: {``X``, ``Y``, ``~Y``, ``X``}. This causes the write-back of the
+value ``X`` to occur last, resulting in the value of ``V`` being set to ``1``.
+
+.. code-block:: c++
+
+  void Trunc(inout int3 V) { }
+
+
+  void main() {
+float3 F = {1.5, 2.6, 3.3};
+Trunc(F); // F == {1.0, 2.0, 3.0}
+  }
+
+In the above example, the argument expression ``F`` undergoes element-wise
+conversion from a float vector to an integer vector to create a temporary
+``int3``. On expiration the temporary undergoes elementwise conversion back to
+the floating point vector type ``float3``. This results in an implicit
+truncation of the vector even if the value is unused in the function.
+
+
+.. code-block:: c++
+
+  void UB(out int X) {}
+
+  void main() {
+int X = 7;
+UB(X); // X is undefined!
+  }
+
+In this example an initialized value is passed to an ``out`` parameter.
+Parameters marked ``out`` are not initialized by the argument expression or
+implicitly by the function. They must be explicitly initialized. In this case
+the argument is not initialized in the function so the temporary is still
+uninitialized when i

[clang] [llvm] [clang][hlsl] Add atan2 intrinsic part 1 (PR #107923)

2024-09-09 Thread Tex Riddell via cfe-commits

https://github.com/tex3d created 
https://github.com/llvm/llvm-project/pull/107923

Issue: #70096

Changes:
- Doc updates:
  - `clang/docs/LanguageExtensions.rst` - Document the new elementwise atan2 
builtin.
  - `llvm/docs/LangRef.rst` - Document the atan2 intrinsic
- TableGen:
  - `clang/include/clang/Basic/Builtins.td` - Implement the atan2 builtin.
  - `llvm/include/llvm/IR/Intrinsics.td` - Create the atan2 intrinsic
- Sema checking:
  - `clang/lib/Sema/SemaChecking.cpp` - Add generic sema checks to the atan2 
builtin
  - `clang/lib/Sema/SemaHLSL` Add HLSL specifc sema checks to the atan2 builtin
- `clang/lib/CodeGen/CGBuiltin.cpp` - invoke the atan2 intrinsic on uses of the 
builtin
- `clang/lib/Headers/hlsl/hlsl_intrinsics.h` - Associate the atan2 builtin with 
the equivalent hlsl apis

This change is an implementation of #87367's investigation on supporting IEEE 
math operations as intrinsics.
Which was discussed in this RFC:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

>From af680b05eb1c83810d6eb044144283ac66d232ec Mon Sep 17 00:00:00 2001
From: Tex Riddell 
Date: Mon, 9 Sep 2024 14:39:18 -0700
Subject: [PATCH] [clang][hlsl] Add atan2 intrinsic part 1

Issue: #70096

Changes:
- Doc updates:
  - `clang/docs/LanguageExtensions.rst` - Document the new elementwise atan2 
builtin.
  - `llvm/docs/LangRef.rst` - Document the atan2 intrinsic
- TableGen:
  - `clang/include/clang/Basic/Builtins.td` - Implement the atan2 builtin.
  - `llvm/include/llvm/IR/Intrinsics.td` - Create the atan2 intrinsic
- Sema checking:
  - `clang/lib/Sema/SemaChecking.cpp` - Add generic sema checks to the atan2 
builtin
  - `clang/lib/Sema/SemaHLSL` Add HLSL specifc sema checks to the atan2 builtin
- `clang/lib/CodeGen/CGBuiltin.cpp` - invoke the atan2 intrinsic on uses of the 
builtin
- `clang/lib/Headers/hlsl/hlsl_intrinsics.h` - Associate the atan2 builtin with 
the equivalent hlsl apis
---
 clang/docs/LanguageExtensions.rst |  1 +
 clang/include/clang/Basic/Builtins.td |  6 ++
 clang/lib/CodeGen/CGBuiltin.cpp   |  3 +
 clang/lib/Headers/hlsl/hlsl_intrinsics.h  | 30 ++
 clang/lib/Sema/SemaChecking.cpp   |  1 +
 clang/lib/Sema/SemaHLSL.cpp   |  1 +
 .../test/CodeGen/builtins-elementwise-math.c  | 20 +++
 .../CodeGen/strictfp-elementwise-bulitins.cpp | 10 
 clang/test/CodeGenHLSL/builtins/atan2.hlsl| 59 +++
 clang/test/Sema/aarch64-sve-vector-trig-ops.c |  6 ++
 clang/test/Sema/builtins-elementwise-math.c   | 24 
 clang/test/Sema/riscv-rvv-vector-trig-ops.c   |  6 ++
 .../SemaCXX/builtins-elementwise-math.cpp |  7 +++
 .../BuiltIns/half-float-only-errors2.hlsl |  7 +++
 llvm/docs/LangRef.rst | 37 
 llvm/include/llvm/IR/Intrinsics.td|  1 +
 16 files changed, 219 insertions(+)
 create mode 100644 clang/test/CodeGenHLSL/builtins/atan2.hlsl
 create mode 100644 clang/test/SemaHLSL/BuiltIns/half-float-only-errors2.hlsl

diff --git a/clang/docs/LanguageExtensions.rst 
b/clang/docs/LanguageExtensions.rst
index 62903fc3744cad..169c6e17fb7f41 100644
--- a/clang/docs/LanguageExtensions.rst
+++ b/clang/docs/LanguageExtensions.rst
@@ -660,6 +660,7 @@ Unless specified otherwise operation(±0) = ±0 and 
operation(±infinity) = ±in
  T __builtin_elementwise_asin(T x)   return the arcsine of x 
interpreted as an angle in radians   floating point types
  T __builtin_elementwise_acos(T x)   return the arccosine of x 
interpreted as an angle in radians floating point types
  T __builtin_elementwise_atan(T x)   return the arctangent of x 
interpreted as an angle in radiansfloating point types
+ T __builtin_elementwise_atan2(T y, T x) return the arctangent of y/x  
   floating point types
  T __builtin_elementwise_sinh(T x)   return the hyperbolic sine of 
angle x in radians floating point types
  T __builtin_elementwise_cosh(T x)   return the hyperbolic cosine of 
angle x in radians   floating point types
  T __builtin_elementwise_tanh(T x)   return the hyperbolic tangent of 
angle x in radians  floating point types
diff --git a/clang/include/clang/Basic/Builtins.td 
b/clang/include/clang/Basic/Builtins.td
index ac33672a32b336..1b4df10eccca94 100644
--- a/clang/include/clang/Basic/Builtins.td
+++ b/clang/include/clang/Basic/Builtins.td
@@ -1244,6 +1244,12 @@ def ElementwiseATan : Builtin {
   let Prototype = "void(...)";
 }
 
+def ElementwiseATan2 : Builtin {
+  let Spellings = ["__builtin_elementwise_atan2"];
+  let Attributes = [NoThrow, Const, CustomTypeChecking];
+  let Prototype = "void(...)";
+}
+
 def ElementwiseBitreverse : Builtin {
   let Spellings = ["__builtin_elementwise_bitreverse"];
   let Attributes = [NoThrow, Const, CustomTypeChecking];
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
inde

[clang] [llvm] [clang][hlsl] Add atan2 intrinsic part 1 (PR #107923)

2024-09-09 Thread Tex Riddell via cfe-commits

https://github.com/tex3d updated 
https://github.com/llvm/llvm-project/pull/107923

>From af680b05eb1c83810d6eb044144283ac66d232ec Mon Sep 17 00:00:00 2001
From: Tex Riddell 
Date: Mon, 9 Sep 2024 14:39:18 -0700
Subject: [PATCH 1/2] [clang][hlsl] Add atan2 intrinsic part 1

Issue: #70096

Changes:
- Doc updates:
  - `clang/docs/LanguageExtensions.rst` - Document the new elementwise atan2 
builtin.
  - `llvm/docs/LangRef.rst` - Document the atan2 intrinsic
- TableGen:
  - `clang/include/clang/Basic/Builtins.td` - Implement the atan2 builtin.
  - `llvm/include/llvm/IR/Intrinsics.td` - Create the atan2 intrinsic
- Sema checking:
  - `clang/lib/Sema/SemaChecking.cpp` - Add generic sema checks to the atan2 
builtin
  - `clang/lib/Sema/SemaHLSL` Add HLSL specifc sema checks to the atan2 builtin
- `clang/lib/CodeGen/CGBuiltin.cpp` - invoke the atan2 intrinsic on uses of the 
builtin
- `clang/lib/Headers/hlsl/hlsl_intrinsics.h` - Associate the atan2 builtin with 
the equivalent hlsl apis
---
 clang/docs/LanguageExtensions.rst |  1 +
 clang/include/clang/Basic/Builtins.td |  6 ++
 clang/lib/CodeGen/CGBuiltin.cpp   |  3 +
 clang/lib/Headers/hlsl/hlsl_intrinsics.h  | 30 ++
 clang/lib/Sema/SemaChecking.cpp   |  1 +
 clang/lib/Sema/SemaHLSL.cpp   |  1 +
 .../test/CodeGen/builtins-elementwise-math.c  | 20 +++
 .../CodeGen/strictfp-elementwise-bulitins.cpp | 10 
 clang/test/CodeGenHLSL/builtins/atan2.hlsl| 59 +++
 clang/test/Sema/aarch64-sve-vector-trig-ops.c |  6 ++
 clang/test/Sema/builtins-elementwise-math.c   | 24 
 clang/test/Sema/riscv-rvv-vector-trig-ops.c   |  6 ++
 .../SemaCXX/builtins-elementwise-math.cpp |  7 +++
 .../BuiltIns/half-float-only-errors2.hlsl |  7 +++
 llvm/docs/LangRef.rst | 37 
 llvm/include/llvm/IR/Intrinsics.td|  1 +
 16 files changed, 219 insertions(+)
 create mode 100644 clang/test/CodeGenHLSL/builtins/atan2.hlsl
 create mode 100644 clang/test/SemaHLSL/BuiltIns/half-float-only-errors2.hlsl

diff --git a/clang/docs/LanguageExtensions.rst 
b/clang/docs/LanguageExtensions.rst
index 62903fc3744cad..169c6e17fb7f41 100644
--- a/clang/docs/LanguageExtensions.rst
+++ b/clang/docs/LanguageExtensions.rst
@@ -660,6 +660,7 @@ Unless specified otherwise operation(±0) = ±0 and 
operation(±infinity) = ±in
  T __builtin_elementwise_asin(T x)   return the arcsine of x 
interpreted as an angle in radians   floating point types
  T __builtin_elementwise_acos(T x)   return the arccosine of x 
interpreted as an angle in radians floating point types
  T __builtin_elementwise_atan(T x)   return the arctangent of x 
interpreted as an angle in radiansfloating point types
+ T __builtin_elementwise_atan2(T y, T x) return the arctangent of y/x  
   floating point types
  T __builtin_elementwise_sinh(T x)   return the hyperbolic sine of 
angle x in radians floating point types
  T __builtin_elementwise_cosh(T x)   return the hyperbolic cosine of 
angle x in radians   floating point types
  T __builtin_elementwise_tanh(T x)   return the hyperbolic tangent of 
angle x in radians  floating point types
diff --git a/clang/include/clang/Basic/Builtins.td 
b/clang/include/clang/Basic/Builtins.td
index ac33672a32b336..1b4df10eccca94 100644
--- a/clang/include/clang/Basic/Builtins.td
+++ b/clang/include/clang/Basic/Builtins.td
@@ -1244,6 +1244,12 @@ def ElementwiseATan : Builtin {
   let Prototype = "void(...)";
 }
 
+def ElementwiseATan2 : Builtin {
+  let Spellings = ["__builtin_elementwise_atan2"];
+  let Attributes = [NoThrow, Const, CustomTypeChecking];
+  let Prototype = "void(...)";
+}
+
 def ElementwiseBitreverse : Builtin {
   let Spellings = ["__builtin_elementwise_bitreverse"];
   let Attributes = [NoThrow, Const, CustomTypeChecking];
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 2a733e4d834cfa..856d975e52e413 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -3797,6 +3797,9 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl 
GD, unsigned BuiltinID,
   case Builtin::BI__builtin_elementwise_atan:
 return RValue::get(emitBuiltinWithOneOverloadedType<1>(
 *this, E, llvm::Intrinsic::atan, "elt.atan"));
+  case Builtin::BI__builtin_elementwise_atan2:
+return RValue::get(emitBuiltinWithOneOverloadedType<2>(
+*this, E, llvm::Intrinsic::atan2, "elt.atan2"));
   case Builtin::BI__builtin_elementwise_ceil:
 return RValue::get(emitBuiltinWithOneOverloadedType<1>(
 *this, E, llvm::Intrinsic::ceil, "elt.ceil"));
diff --git a/clang/lib/Headers/hlsl/hlsl_intrinsics.h 
b/clang/lib/Headers/hlsl/hlsl_intrinsics.h
index 6d38b668fe770e..f4edcf87534a06 100644
--- a/clang/lib/Headers/hlsl/hlsl_intrinsics.h
+++ b/clang/lib/Head

[clang] [llvm] [clang][hlsl] Add atan2 intrinsic part 1 (PR #107923)

2024-09-09 Thread Tex Riddell via cfe-commits

https://github.com/tex3d updated 
https://github.com/llvm/llvm-project/pull/107923

>From 44b355687a3e148bfe3d5e4f95efd39363b58b07 Mon Sep 17 00:00:00 2001
From: Tex Riddell 
Date: Mon, 9 Sep 2024 14:39:18 -0700
Subject: [PATCH 1/2] [clang][hlsl] Add atan2 intrinsic part 1

Issue: #70096

Changes:
- Doc updates:
  - `clang/docs/LanguageExtensions.rst` - Document the new elementwise atan2 
builtin.
  - `llvm/docs/LangRef.rst` - Document the atan2 intrinsic
- TableGen:
  - `clang/include/clang/Basic/Builtins.td` - Implement the atan2 builtin.
  - `llvm/include/llvm/IR/Intrinsics.td` - Create the atan2 intrinsic
- Sema checking:
  - `clang/lib/Sema/SemaChecking.cpp` - Add generic sema checks to the atan2 
builtin
  - `clang/lib/Sema/SemaHLSL` Add HLSL specifc sema checks to the atan2 builtin
- `clang/lib/CodeGen/CGBuiltin.cpp` - invoke the atan2 intrinsic on uses of the 
builtin
- `clang/lib/Headers/hlsl/hlsl_intrinsics.h` - Associate the atan2 builtin with 
the equivalent hlsl apis
---
 clang/docs/LanguageExtensions.rst |  1 +
 clang/include/clang/Basic/Builtins.td |  6 ++
 clang/lib/CodeGen/CGBuiltin.cpp   |  3 +
 clang/lib/Headers/hlsl/hlsl_intrinsics.h  | 30 ++
 clang/lib/Sema/SemaChecking.cpp   |  1 +
 clang/lib/Sema/SemaHLSL.cpp   |  1 +
 .../test/CodeGen/builtins-elementwise-math.c  | 20 +++
 .../CodeGen/strictfp-elementwise-bulitins.cpp | 10 
 clang/test/CodeGenHLSL/builtins/atan2.hlsl| 59 +++
 clang/test/Sema/aarch64-sve-vector-trig-ops.c |  6 ++
 clang/test/Sema/builtins-elementwise-math.c   | 24 
 clang/test/Sema/riscv-rvv-vector-trig-ops.c   |  6 ++
 .../SemaCXX/builtins-elementwise-math.cpp |  7 +++
 .../BuiltIns/half-float-only-errors2.hlsl |  7 +++
 llvm/docs/LangRef.rst | 37 
 llvm/include/llvm/IR/Intrinsics.td|  1 +
 16 files changed, 219 insertions(+)
 create mode 100644 clang/test/CodeGenHLSL/builtins/atan2.hlsl
 create mode 100644 clang/test/SemaHLSL/BuiltIns/half-float-only-errors2.hlsl

diff --git a/clang/docs/LanguageExtensions.rst 
b/clang/docs/LanguageExtensions.rst
index c08697282cbfe8..dd4a14e88394e9 100644
--- a/clang/docs/LanguageExtensions.rst
+++ b/clang/docs/LanguageExtensions.rst
@@ -660,6 +660,7 @@ Unless specified otherwise operation(±0) = ±0 and 
operation(±infinity) = ±in
  T __builtin_elementwise_asin(T x)   return the arcsine of x 
interpreted as an angle in radians   floating point types
  T __builtin_elementwise_acos(T x)   return the arccosine of x 
interpreted as an angle in radians floating point types
  T __builtin_elementwise_atan(T x)   return the arctangent of x 
interpreted as an angle in radiansfloating point types
+ T __builtin_elementwise_atan2(T y, T x) return the arctangent of y/x  
   floating point types
  T __builtin_elementwise_sinh(T x)   return the hyperbolic sine of 
angle x in radians floating point types
  T __builtin_elementwise_cosh(T x)   return the hyperbolic cosine of 
angle x in radians   floating point types
  T __builtin_elementwise_tanh(T x)   return the hyperbolic tangent of 
angle x in radians  floating point types
diff --git a/clang/include/clang/Basic/Builtins.td 
b/clang/include/clang/Basic/Builtins.td
index d9833b6559eab3..38f3083348b4dd 100644
--- a/clang/include/clang/Basic/Builtins.td
+++ b/clang/include/clang/Basic/Builtins.td
@@ -1250,6 +1250,12 @@ def ElementwiseATan : Builtin {
   let Prototype = "void(...)";
 }
 
+def ElementwiseATan2 : Builtin {
+  let Spellings = ["__builtin_elementwise_atan2"];
+  let Attributes = [NoThrow, Const, CustomTypeChecking];
+  let Prototype = "void(...)";
+}
+
 def ElementwiseBitreverse : Builtin {
   let Spellings = ["__builtin_elementwise_bitreverse"];
   let Attributes = [NoThrow, Const, CustomTypeChecking];
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 0078ceb7e892af..94e6448c7754e7 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -3800,6 +3800,9 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl 
GD, unsigned BuiltinID,
   case Builtin::BI__builtin_elementwise_atan:
 return RValue::get(emitBuiltinWithOneOverloadedType<1>(
 *this, E, llvm::Intrinsic::atan, "elt.atan"));
+  case Builtin::BI__builtin_elementwise_atan2:
+return RValue::get(emitBuiltinWithOneOverloadedType<2>(
+*this, E, llvm::Intrinsic::atan2, "elt.atan2"));
   case Builtin::BI__builtin_elementwise_ceil:
 return RValue::get(emitBuiltinWithOneOverloadedType<1>(
 *this, E, llvm::Intrinsic::ceil, "elt.ceil"));
diff --git a/clang/lib/Headers/hlsl/hlsl_intrinsics.h 
b/clang/lib/Headers/hlsl/hlsl_intrinsics.h
index 2ac18056b0fc3d..e80ff2c00d9b50 100644
--- a/clang/lib/Headers/hlsl/hlsl_intrinsics.h
+++ b/clang/lib/Head

[clang] [llvm] [clang][hlsl] Add atan2 intrinsic part 1 (PR #107923)

2024-09-09 Thread Tex Riddell via cfe-commits

tex3d wrote:

@farzonl 
> since this is a new builtin it would make sense to add it to 
> `clang/docs/ReleaseNotes.rst`. There should be some examples from Josh's PRs.

I can't find the examples you're referring to.  Do you think you could point 
one out?  Also, what about the other builtins that were added previously, 
related to the same RFC?

https://github.com/llvm/llvm-project/pull/107923
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [HLSL] add loop unroll (PR #93879)

2024-06-03 Thread Tex Riddell via cfe-commits


@@ -635,6 +635,17 @@ void LoopInfoStack::push(BasicBlock *Header, 
clang::ASTContext &Ctx,
 Option = LoopHintAttr::UnrollCount;
 State = LoopHintAttr::Numeric;
   }
+} else if (HLSLLoopHint) {
+  ValueInt = HLSLLoopHint->getDirective();
+  if (HLSLLoopHint->getSemanticSpelling() ==
+  HLSLLoopHintAttr::Spelling::Microsoft_unroll) {
+if (ValueInt == 0)
+  State = LoopHintAttr::Enable;
+if (ValueInt > 0) {
+  Option = LoopHintAttr::UnrollCount;

tex3d wrote:

You should think of the HLSL argument as manually specifying the maximum 
iterations for the loop.

https://github.com/llvm/llvm-project/pull/93879
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [HLSL] Correctly set `__HLSL_ENABLE_16_BIT` (PR #89788)

2024-04-23 Thread Tex Riddell via cfe-commits

https://github.com/tex3d approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/89788
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [DirectX] Set DXIL Version using shader model version in compilation target profile (PR #89823)

2024-04-23 Thread Tex Riddell via cfe-commits


@@ -68,25 +68,25 @@ TEST(DxcModeTest, TargetProfileValidation) {
   IntrusiveRefCntPtr DiagOpts = new DiagnosticOptions();
   DiagnosticsEngine Diags(DiagID, &*DiagOpts, DiagConsumer);
 
-  validateTargetProfile("-Tvs_6_0", "dxil--shadermodel6.0-vertex",
+  validateTargetProfile("-Tvs_6_0", "dxilv1.0--shadermodel6.0-vertex",
 InMemoryFileSystem, Diags);
-  validateTargetProfile("-Ths_6_1", "dxil--shadermodel6.1-hull",
+  validateTargetProfile("-Ths_6_1", "dxilv1.1--shadermodel6.1-hull",
 InMemoryFileSystem, Diags);
-  validateTargetProfile("-Tds_6_2", "dxil--shadermodel6.2-domain",
+  validateTargetProfile("-Tds_6_2", "dxilv1.2--shadermodel6.2-domain",
 InMemoryFileSystem, Diags);
-  validateTargetProfile("-Tds_6_2", "dxil--shadermodel6.2-domain",
+  validateTargetProfile("-Tds_6_2", "dxilv1.2--shadermodel6.2-domain",
 InMemoryFileSystem, Diags);
-  validateTargetProfile("-Tgs_6_3", "dxil--shadermodel6.3-geometry",
+  validateTargetProfile("-Tgs_6_3", "dxilv1.3--shadermodel6.3-geometry",
 InMemoryFileSystem, Diags);
-  validateTargetProfile("-Tps_6_4", "dxil--shadermodel6.4-pixel",
+  validateTargetProfile("-Tps_6_4", "dxilv1.4--shadermodel6.4-pixel",
 InMemoryFileSystem, Diags);
-  validateTargetProfile("-Tcs_6_5", "dxil--shadermodel6.5-compute",
+  validateTargetProfile("-Tcs_6_5", "dxilv1.5--shadermodel6.5-compute",
 InMemoryFileSystem, Diags);
-  validateTargetProfile("-Tms_6_6", "dxil--shadermodel6.6-mesh",
+  validateTargetProfile("-Tms_6_6", "dxilv1.6--shadermodel6.6-mesh",
 InMemoryFileSystem, Diags);
-  validateTargetProfile("-Tas_6_7", "dxil--shadermodel6.7-amplification",
+  validateTargetProfile("-Tas_6_7", "dxilv1.7--shadermodel6.7-amplification",
 InMemoryFileSystem, Diags);
-  validateTargetProfile("-Tlib_6_x", "dxil--shadermodel6.15-library",
+  validateTargetProfile("-Tlib_6_8", "dxilv1.8--shadermodel6.8-library",

tex3d wrote:

There is, in fact, a lib_6_x target that maps to a more unconstrained offline 
linking target (with minor version = 15) which is not expected to be compatible 
between compiler versions.  Do we care to model that in clang at this time?  
This change removes this target, so I just wanted to know if that was a 
deliberate decision.

https://github.com/llvm/llvm-project/pull/89823
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Implement resource binding type prefix mismatch errors (PR #87578)

2024-04-23 Thread Tex Riddell via cfe-commits


@@ -0,0 +1,74 @@
+// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.3-library -x hlsl -o - 
-fsyntax-only %s -verify
+
+// the below will cause an llvm unreachable, because RWBuffers don't have 
resource attributes yet
+// NOT YET IMPLEMENTED : {{invalid register name prefix 'b' for register type 
'RWBuffer' (expected 'u')}}
+// NOT YET IMPLEMENTED RWBuffer a : register(b2, space1);
+
+// the below will cause an llvm unreachable, because RWBuffers don't have 
resource attributes yet
+// NOT YET IMPLEMENTED : {{invalid register name prefix 't' for register type 
'RWBuffer' (expected 'u')}}
+// NOT YET IMPLEMENTED RWBuffer b : register(t2, space1);
+
+// NOT YET IMPLEMENTED : {{invalid register name prefix 'u' for register type 
'Texture1D' (expected 't')}}
+// NOT YET IMPLEMENTED Texture1D tex : register(u3);
+
+// NOT YET IMPLEMENTED : {{invalid register name prefix 's' for register type 
'Texture2D' (expected 't')}}
+// NOT YET IMPLEMENTED Texture2D Texture : register(s0);
+
+// NOT YET IMPLEMENTED : {{invalid register name prefix 'u' for register type 
'Texture2DMS' (expected 't')}}
+// NOT YET IMPLEMENTED Texture2DMS T2DMS_t2 : register(u2)
+
+// NOT YET IMPLEMENTED : {{invalid register name prefix 't' for register type 
'RWTexture3D' (expected 'u')}}
+// NOT YET IMPLEMENTED RWTexture3D RWT3D_u1 : register(t1)
+
+// NOT YET IMPLEMENTED : {{invalid register name prefix 'b' for register type 
'Texture2DMS' (expected 't' or 's')}}
+// NOT YET IMPLEMENTED TextureCube TCube_b2 : register(B2);
+
+// NOT YET IMPLEMENTED : {{invalid register name prefix 'b' for register type 
'Texture2DMS' (expected 't')}}
+// NOT YET IMPLEMENTED TextureCubeArray TCubeArray_t2 : register(b2);
+
+// NOT YET IMPLEMENTED : {{invalid register name prefix 'b' for register type 
'Texture2DMS' (expected 't' or 's')}}
+// NOT YET IMPLEMENTED Texture1DArray T1DArray_t2 : register(b2);
+
+// NOT YET IMPLEMENTED : {{invalid register name prefix 'u' for register type 
'Texture2DMS' (expected 't' or 's')}}
+// NOT YET IMPLEMENTED Texture2DArray T2DArray_b2 : register(B2);
+
+// NOT YET IMPLEMENTED : {{invalid register name prefix 'u' for register type 
'Texture2DMS' (expected 'b' or 'c' or 'i')}}
+// NOT YET IMPLEMENTED Texture2DMSArray msTextureArray : register(t2, 
space2);
+
+// NOT YET IMPLEMENTED : {{invalid register name prefix 'u' for register type 
'TCubeArray_f2' (expected 't' or 's')}}
+// NOT YET IMPLEMENTED TextureCubeArray TCubeArray_f2 : register(u2);
+
+// NOT YET IMPLEMENTED : {{invalid register name prefix 'u' for register type 
'TypedBuffer' (expected 't')}}
+// NOT YET IMPLEMENTED TypedBuffer tbuf : register(u2);
+
+// NOT YET IMPLEMENTED : {{invalid register name prefix 'u' for register type 
'RawBuffer' (expected 't')}}
+// NOT YET IMPLEMENTED RawBuffer rbuf : register(u2);
+
+// NOT YET IMPLEMENTED : {{invalid register name prefix 't' for register type 
'StructuredBuffer' (expected 'u')}}
+// NOT YET IMPLEMENTED StructuredBuffer ROVStructuredBuff_t2  : register(T2);
+
+// expected-error@+1 {{invalid register name prefix 's' for register type 
'cbuffer' (expected 'b')}}
+cbuffer f : register(s2, space1) {}
+
+// NOT YET IMPLEMENTED : {{invalid register name prefix 't' for register type 
'Sampler' (expected 's')}}
+// Can this type just be Sampler instead of SamplerState?
+// NOT YET IMPLEMENTED SamplerState MySampler : register(t3, space1);
+
+// expected-error@+1 {{invalid register name prefix 's' for register type 
'tbuffer' (expected 'b')}}
+tbuffer f : register(s2, space1) {}

tex3d wrote:

As was stated in an earlier (resolved but seemingly not fixed) comment, tbuffer 
must be bound to `t`.

https://github.com/llvm/llvm-project/pull/87578
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Implement resource binding type prefix mismatch errors (PR #87578)

2024-04-23 Thread Tex Riddell via cfe-commits


@@ -0,0 +1,74 @@
+// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.3-library -x hlsl -o - 
-fsyntax-only %s -verify
+
+// the below will cause an llvm unreachable, because RWBuffers don't have 
resource attributes yet
+// NOT YET IMPLEMENTED : {{invalid register name prefix 'b' for register type 
'RWBuffer' (expected 'u')}}
+// NOT YET IMPLEMENTED RWBuffer a : register(b2, space1);
+
+// the below will cause an llvm unreachable, because RWBuffers don't have 
resource attributes yet
+// NOT YET IMPLEMENTED : {{invalid register name prefix 't' for register type 
'RWBuffer' (expected 'u')}}
+// NOT YET IMPLEMENTED RWBuffer b : register(t2, space1);
+
+// NOT YET IMPLEMENTED : {{invalid register name prefix 'u' for register type 
'Texture1D' (expected 't')}}
+// NOT YET IMPLEMENTED Texture1D tex : register(u3);
+
+// NOT YET IMPLEMENTED : {{invalid register name prefix 's' for register type 
'Texture2D' (expected 't')}}
+// NOT YET IMPLEMENTED Texture2D Texture : register(s0);
+
+// NOT YET IMPLEMENTED : {{invalid register name prefix 'u' for register type 
'Texture2DMS' (expected 't')}}
+// NOT YET IMPLEMENTED Texture2DMS T2DMS_t2 : register(u2)
+
+// NOT YET IMPLEMENTED : {{invalid register name prefix 't' for register type 
'RWTexture3D' (expected 'u')}}
+// NOT YET IMPLEMENTED RWTexture3D RWT3D_u1 : register(t1)
+
+// NOT YET IMPLEMENTED : {{invalid register name prefix 'b' for register type 
'Texture2DMS' (expected 't' or 's')}}
+// NOT YET IMPLEMENTED TextureCube TCube_b2 : register(B2);

tex3d wrote:

's' should not be considered a valid register binding for TextureCube, and the 
message says 'Texture2DMS' instead of 'TextureCube'.

https://github.com/llvm/llvm-project/pull/87578
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Implement resource binding type prefix mismatch errors (PR #87578)

2024-04-23 Thread Tex Riddell via cfe-commits


@@ -0,0 +1,74 @@
+// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.3-library -x hlsl -o - 
-fsyntax-only %s -verify
+
+// the below will cause an llvm unreachable, because RWBuffers don't have 
resource attributes yet
+// NOT YET IMPLEMENTED : {{invalid register name prefix 'b' for register type 
'RWBuffer' (expected 'u')}}
+// NOT YET IMPLEMENTED RWBuffer a : register(b2, space1);
+
+// the below will cause an llvm unreachable, because RWBuffers don't have 
resource attributes yet
+// NOT YET IMPLEMENTED : {{invalid register name prefix 't' for register type 
'RWBuffer' (expected 'u')}}
+// NOT YET IMPLEMENTED RWBuffer b : register(t2, space1);
+
+// NOT YET IMPLEMENTED : {{invalid register name prefix 'u' for register type 
'Texture1D' (expected 't')}}
+// NOT YET IMPLEMENTED Texture1D tex : register(u3);
+
+// NOT YET IMPLEMENTED : {{invalid register name prefix 's' for register type 
'Texture2D' (expected 't')}}
+// NOT YET IMPLEMENTED Texture2D Texture : register(s0);
+
+// NOT YET IMPLEMENTED : {{invalid register name prefix 'u' for register type 
'Texture2DMS' (expected 't')}}
+// NOT YET IMPLEMENTED Texture2DMS T2DMS_t2 : register(u2)
+
+// NOT YET IMPLEMENTED : {{invalid register name prefix 't' for register type 
'RWTexture3D' (expected 'u')}}
+// NOT YET IMPLEMENTED RWTexture3D RWT3D_u1 : register(t1)
+
+// NOT YET IMPLEMENTED : {{invalid register name prefix 'b' for register type 
'Texture2DMS' (expected 't' or 's')}}
+// NOT YET IMPLEMENTED TextureCube TCube_b2 : register(B2);

tex3d wrote:

Same with all other `(expected 't' or 's')` below.

https://github.com/llvm/llvm-project/pull/87578
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Implement resource binding type prefix mismatch errors (PR #87578)

2024-04-23 Thread Tex Riddell via cfe-commits


@@ -0,0 +1,74 @@
+// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.3-library -x hlsl -o - 
-fsyntax-only %s -verify
+
+// the below will cause an llvm unreachable, because RWBuffers don't have 
resource attributes yet
+// NOT YET IMPLEMENTED : {{invalid register name prefix 'b' for register type 
'RWBuffer' (expected 'u')}}
+// NOT YET IMPLEMENTED RWBuffer a : register(b2, space1);
+
+// the below will cause an llvm unreachable, because RWBuffers don't have 
resource attributes yet
+// NOT YET IMPLEMENTED : {{invalid register name prefix 't' for register type 
'RWBuffer' (expected 'u')}}
+// NOT YET IMPLEMENTED RWBuffer b : register(t2, space1);
+
+// NOT YET IMPLEMENTED : {{invalid register name prefix 'u' for register type 
'Texture1D' (expected 't')}}
+// NOT YET IMPLEMENTED Texture1D tex : register(u3);
+
+// NOT YET IMPLEMENTED : {{invalid register name prefix 's' for register type 
'Texture2D' (expected 't')}}
+// NOT YET IMPLEMENTED Texture2D Texture : register(s0);
+
+// NOT YET IMPLEMENTED : {{invalid register name prefix 'u' for register type 
'Texture2DMS' (expected 't')}}
+// NOT YET IMPLEMENTED Texture2DMS T2DMS_t2 : register(u2)
+
+// NOT YET IMPLEMENTED : {{invalid register name prefix 't' for register type 
'RWTexture3D' (expected 'u')}}
+// NOT YET IMPLEMENTED RWTexture3D RWT3D_u1 : register(t1)
+
+// NOT YET IMPLEMENTED : {{invalid register name prefix 'b' for register type 
'Texture2DMS' (expected 't' or 's')}}
+// NOT YET IMPLEMENTED TextureCube TCube_b2 : register(B2);
+
+// NOT YET IMPLEMENTED : {{invalid register name prefix 'b' for register type 
'Texture2DMS' (expected 't')}}
+// NOT YET IMPLEMENTED TextureCubeArray TCubeArray_t2 : register(b2);
+
+// NOT YET IMPLEMENTED : {{invalid register name prefix 'b' for register type 
'Texture2DMS' (expected 't' or 's')}}
+// NOT YET IMPLEMENTED Texture1DArray T1DArray_t2 : register(b2);
+
+// NOT YET IMPLEMENTED : {{invalid register name prefix 'u' for register type 
'Texture2DMS' (expected 't' or 's')}}
+// NOT YET IMPLEMENTED Texture2DArray T2DArray_b2 : register(B2);
+
+// NOT YET IMPLEMENTED : {{invalid register name prefix 'u' for register type 
'Texture2DMS' (expected 'b' or 'c' or 'i')}}
+// NOT YET IMPLEMENTED Texture2DMSArray msTextureArray : register(t2, 
space2);

tex3d wrote:

Expected would be 't', not `'b' or 'c' or 'i'`.  In fact, that group ('b', 'c', 
'i') were legacy constant register bindings for DX9, which is where I suspect 
this comes from, where 'b' meant a special bool constant and 'i' mapped to 
special loop constant values (used in DX9 shader models), and 'c' was a float 
constant value.  'b' was since used for constant buffer binding location as 
well, so there is a bit of a collision, but not on the same variable type, 
since the 'b', 'c', 'i' bindings were on global numeric variables, not on 
resource objects or cbuffer/tbuffer declarations.

So, we no longer support the DX9 'b','c','i' numeric constant bindings the same 
way, but we do reuse 'c' for manual cbuffer layout, and 'b' for cbuffer or 
ConstantBuffer binding.

https://github.com/llvm/llvm-project/pull/87578
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Implement resource binding type prefix mismatch errors (PR #87578)

2024-04-29 Thread Tex Riddell via cfe-commits


@@ -44,7 +44,7 @@ void foo2() {
   // expected-warning@+1 {{'register' attribute only applies to 
cbuffer/tbuffer and external global variables}}
   extern RWBuffer U2 : register(u5);
 }
-// FIXME: expect-error once fix 
https://github.com/llvm/llvm-project/issues/57886.
+// expected-error@+1 {{invalid register name prefix 'u' for 'float' (expected 
't')}}

tex3d wrote:

We should consider deprecating the register binding of this style: `float b : 
register(c0);`, since it's not supported by DXC, and was only supported by FXC 
for DX9 targets.  Register binding should only be applicable to global 
resource/sampler declarations and cbuffer/tbuffer declarations.  So the answer 
to "what prefix should that be" is: we don't support register bindings on 
values that go into the constant buffer, so there is no valid prefix to use 
here.  The closest equivalent to these legacy register bindings that we do 
support are `packoffset` annotations.

https://github.com/llvm/llvm-project/pull/87578
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [HLSL] Shore up floating point conversions (PR #90222)

2024-05-01 Thread Tex Riddell via cfe-commits


@@ -0,0 +1,229 @@
+// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.3-library -fnative-half-type 
-finclude-default-header -Wconversion -verify -o - %s
+// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.3-library -fnative-half-type 
-finclude-default-header -ast-dump %s | FileCheck %s
+
+// This test verifies floating point type implicit conversion ranks for 
overload
+// resolution. In HLSL the built-in type ranks are half < float < double. This
+// applies to both scalar and vector types.
+
+// HLSL allows implicit truncation fo types, so it differentiates between
+// promotions (converting to larger types) and conversions (converting to
+// smaller types). Promotions are preferred over conversions. Promotions prefer
+// promoting to the next lowest type in the ranking order. Conversions prefer
+// converting to the next highest type in the ranking order.
+
+void HalfFloatDouble(double D);
+void HalfFloatDouble(float F);
+void HalfFloatDouble(half H);
+
+// CHECK: FunctionDecl {{.*}} used HalfFloatDouble 'void (double)'
+// CHECK: FunctionDecl {{.*}} used HalfFloatDouble 'void (float)'
+// CHECK: FunctionDecl {{.*}} used HalfFloatDouble 'void (half)'
+
+void FloatDouble(double D);
+void FloatDouble(float F);
+
+// CHECK: FunctionDecl {{.*}} used FloatDouble 'void (double)'
+// CHECK: FunctionDecl {{.*}} used FloatDouble 'void (float)'
+
+void HalfDouble(double D);
+void HalfDouble(half H);
+
+// CHECK: FunctionDecl {{.*}} used HalfDouble 'void (double)'
+// CHECK: FunctionDecl {{.*}} used HalfDouble 'void (half)'
+
+void HalfFloat(float F);
+void HalfFloat(half H);
+
+// CHECK: FunctionDecl {{.*}} used HalfFloat 'void (float)'
+// CHECK: FunctionDecl {{.*}} used HalfFloat 'void (half)'
+
+void Double(double D);
+void Float(float F);
+void Half(half H);
+
+// CHECK: FunctionDecl {{.*}} used Double 'void (double)'
+// CHECK: FunctionDecl {{.*}} used Float 'void (float)'
+// CHECK: FunctionDecl {{.*}} used Half 'void (half)'
+
+
+// Case 1: A function declared with overloads for half float and double types.
+//   (a) When called with half, it will resolve to half because half is an 
exact
+//   match.
+//   (b) When called with float it will resolve to float because float is an
+//   exact match.
+//   (c) When called with double it will resolve to double because it is an
+//   exact match.
+
+// CHECK: FunctionDecl {{.*}} Case1 'void (half, float, double)'

tex3d wrote:

Would using `CHECK-LABEL` be a bit better for these `Case*` checks?  I usually 
find it better for this purpose, since it prevents checks from before the label 
from erroneously matching content after the label which can make the subsequent 
error reporting harder to deal with.  But perhaps you can't use it due to 
restrictions on using regex within `CHECK-LABEL`, which IIRC was an issue?

https://github.com/llvm/llvm-project/pull/90222
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [HLSL] Shore up floating point conversions (PR #90222)

2024-05-01 Thread Tex Riddell via cfe-commits

https://github.com/tex3d approved this pull request.


https://github.com/llvm/llvm-project/pull/90222
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [clang][hlsl] Add atan2 intrinsic part 1 (PR #107923)

2024-09-13 Thread Tex Riddell via cfe-commits

https://github.com/tex3d updated 
https://github.com/llvm/llvm-project/pull/107923

>From 44b355687a3e148bfe3d5e4f95efd39363b58b07 Mon Sep 17 00:00:00 2001
From: Tex Riddell 
Date: Mon, 9 Sep 2024 14:39:18 -0700
Subject: [PATCH 1/3] [clang][hlsl] Add atan2 intrinsic part 1

Issue: #70096

Changes:
- Doc updates:
  - `clang/docs/LanguageExtensions.rst` - Document the new elementwise atan2 
builtin.
  - `llvm/docs/LangRef.rst` - Document the atan2 intrinsic
- TableGen:
  - `clang/include/clang/Basic/Builtins.td` - Implement the atan2 builtin.
  - `llvm/include/llvm/IR/Intrinsics.td` - Create the atan2 intrinsic
- Sema checking:
  - `clang/lib/Sema/SemaChecking.cpp` - Add generic sema checks to the atan2 
builtin
  - `clang/lib/Sema/SemaHLSL` Add HLSL specifc sema checks to the atan2 builtin
- `clang/lib/CodeGen/CGBuiltin.cpp` - invoke the atan2 intrinsic on uses of the 
builtin
- `clang/lib/Headers/hlsl/hlsl_intrinsics.h` - Associate the atan2 builtin with 
the equivalent hlsl apis
---
 clang/docs/LanguageExtensions.rst |  1 +
 clang/include/clang/Basic/Builtins.td |  6 ++
 clang/lib/CodeGen/CGBuiltin.cpp   |  3 +
 clang/lib/Headers/hlsl/hlsl_intrinsics.h  | 30 ++
 clang/lib/Sema/SemaChecking.cpp   |  1 +
 clang/lib/Sema/SemaHLSL.cpp   |  1 +
 .../test/CodeGen/builtins-elementwise-math.c  | 20 +++
 .../CodeGen/strictfp-elementwise-bulitins.cpp | 10 
 clang/test/CodeGenHLSL/builtins/atan2.hlsl| 59 +++
 clang/test/Sema/aarch64-sve-vector-trig-ops.c |  6 ++
 clang/test/Sema/builtins-elementwise-math.c   | 24 
 clang/test/Sema/riscv-rvv-vector-trig-ops.c   |  6 ++
 .../SemaCXX/builtins-elementwise-math.cpp |  7 +++
 .../BuiltIns/half-float-only-errors2.hlsl |  7 +++
 llvm/docs/LangRef.rst | 37 
 llvm/include/llvm/IR/Intrinsics.td|  1 +
 16 files changed, 219 insertions(+)
 create mode 100644 clang/test/CodeGenHLSL/builtins/atan2.hlsl
 create mode 100644 clang/test/SemaHLSL/BuiltIns/half-float-only-errors2.hlsl

diff --git a/clang/docs/LanguageExtensions.rst 
b/clang/docs/LanguageExtensions.rst
index c08697282cbfe8..dd4a14e88394e9 100644
--- a/clang/docs/LanguageExtensions.rst
+++ b/clang/docs/LanguageExtensions.rst
@@ -660,6 +660,7 @@ Unless specified otherwise operation(±0) = ±0 and 
operation(±infinity) = ±in
  T __builtin_elementwise_asin(T x)   return the arcsine of x 
interpreted as an angle in radians   floating point types
  T __builtin_elementwise_acos(T x)   return the arccosine of x 
interpreted as an angle in radians floating point types
  T __builtin_elementwise_atan(T x)   return the arctangent of x 
interpreted as an angle in radiansfloating point types
+ T __builtin_elementwise_atan2(T y, T x) return the arctangent of y/x  
   floating point types
  T __builtin_elementwise_sinh(T x)   return the hyperbolic sine of 
angle x in radians floating point types
  T __builtin_elementwise_cosh(T x)   return the hyperbolic cosine of 
angle x in radians   floating point types
  T __builtin_elementwise_tanh(T x)   return the hyperbolic tangent of 
angle x in radians  floating point types
diff --git a/clang/include/clang/Basic/Builtins.td 
b/clang/include/clang/Basic/Builtins.td
index d9833b6559eab3..38f3083348b4dd 100644
--- a/clang/include/clang/Basic/Builtins.td
+++ b/clang/include/clang/Basic/Builtins.td
@@ -1250,6 +1250,12 @@ def ElementwiseATan : Builtin {
   let Prototype = "void(...)";
 }
 
+def ElementwiseATan2 : Builtin {
+  let Spellings = ["__builtin_elementwise_atan2"];
+  let Attributes = [NoThrow, Const, CustomTypeChecking];
+  let Prototype = "void(...)";
+}
+
 def ElementwiseBitreverse : Builtin {
   let Spellings = ["__builtin_elementwise_bitreverse"];
   let Attributes = [NoThrow, Const, CustomTypeChecking];
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 0078ceb7e892af..94e6448c7754e7 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -3800,6 +3800,9 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl 
GD, unsigned BuiltinID,
   case Builtin::BI__builtin_elementwise_atan:
 return RValue::get(emitBuiltinWithOneOverloadedType<1>(
 *this, E, llvm::Intrinsic::atan, "elt.atan"));
+  case Builtin::BI__builtin_elementwise_atan2:
+return RValue::get(emitBuiltinWithOneOverloadedType<2>(
+*this, E, llvm::Intrinsic::atan2, "elt.atan2"));
   case Builtin::BI__builtin_elementwise_ceil:
 return RValue::get(emitBuiltinWithOneOverloadedType<1>(
 *this, E, llvm::Intrinsic::ceil, "elt.ceil"));
diff --git a/clang/lib/Headers/hlsl/hlsl_intrinsics.h 
b/clang/lib/Headers/hlsl/hlsl_intrinsics.h
index 2ac18056b0fc3d..e80ff2c00d9b50 100644
--- a/clang/lib/Headers/hlsl/hlsl_intrinsics.h
+++ b/clang/lib/Head

[clang] [llvm] [clang][hlsl] Add atan2 intrinsic part 1 (PR #107923)

2024-09-16 Thread Tex Riddell via cfe-commits

tex3d wrote:

Abandoning this PR in favor of a better sequence of steps starting with #108865.

https://github.com/llvm/llvm-project/pull/107923
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [clang][hlsl] Add atan2 intrinsic part 1 (PR #107923)

2024-09-16 Thread Tex Riddell via cfe-commits

https://github.com/tex3d closed https://github.com/llvm/llvm-project/pull/107923
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Add __builtin_(elementwise|reduce)_(max|min)imum (PR #110198)

2024-10-01 Thread Tex Riddell via cfe-commits

tex3d wrote:

Test fix has re-landed: fea18afeed39fe4435d67eee1834f0f34b23013d.

https://github.com/llvm/llvm-project/pull/110198
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] fea18af - Fix failing test caused by b70d327

2024-10-01 Thread Tex Riddell via cfe-commits

Author: Tex Riddell
Date: 2024-10-01T18:26:05-07:00
New Revision: fea18afeed39fe4435d67eee1834f0f34b23013d

URL: 
https://github.com/llvm/llvm-project/commit/fea18afeed39fe4435d67eee1834f0f34b23013d
DIFF: 
https://github.com/llvm/llvm-project/commit/fea18afeed39fe4435d67eee1834f0f34b23013d.diff

LOG: Fix failing test caused by b70d327

`clang/test/Sema/aarch64-sve-vector-trig-ops.c` wasn't updated when merging PR 
#110187, which changed the expected diagnostics for the atan2 test.

Added: 


Modified: 
clang/test/Sema/aarch64-sve-vector-trig-ops.c

Removed: 




diff  --git a/clang/test/Sema/aarch64-sve-vector-trig-ops.c 
b/clang/test/Sema/aarch64-sve-vector-trig-ops.c
index 31f608bf151099..3fe6834be2e0b7 100644
--- a/clang/test/Sema/aarch64-sve-vector-trig-ops.c
+++ b/clang/test/Sema/aarch64-sve-vector-trig-ops.c
@@ -25,7 +25,7 @@ svfloat32_t test_atan_vv_i8mf8(svfloat32_t v) {
 svfloat32_t test_atan2_vv_i8mf8(svfloat32_t v) {
 
   return __builtin_elementwise_atan2(v, v);
-  // expected-error@-1 {{1st argument must be a vector, integer or floating 
point type}}
+  // expected-error@-1 {{1st argument must be a floating point type}}
 }
 
 svfloat32_t test_sin_vv_i8mf8(svfloat32_t v) {



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Add __builtin_(elementwise|reduce)_(max|min)imum (PR #110198)

2024-10-01 Thread Tex Riddell via cfe-commits

tex3d wrote:

Unfortunately, I didn't see your revert, and my pull didn't show any changes 
from main before I quickly committed the test change.  So my test change would 
also break the build with your revert.  So I reverted the test change just now! 
Revert: 5d308af894ccc3f7a288d6abd6f9097b4cbc8cf4.

https://github.com/llvm/llvm-project/pull/110198
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Add __builtin_(elementwise|reduce)_(max|min)imum (PR #110198)

2024-10-01 Thread Tex Riddell via cfe-commits

tex3d wrote:

> But IIUC, the diagnostic message change from this PR might be unintentional. 
> The expected error message in 
> [793ded7](https://github.com/llvm/llvm-project/commit/793ded7d0b7f1407636a98007f83074b8dd5f765)
>  doesn't align to the error message from other tests in the same file. Should 
> that be addressed?

The other error messages in the file are now different because the path taken 
for the one arg intrinsic is different (calling 
`PrepareBuiltinElementwiseMathOneArgCall`, which calls 
`checkMathBuiltinElementType`, which emits this error).

It would be nice to fix these somehow, but I don't know if that should be part 
of landing this PR.

I didn't like that wording in the diagnostic for multiple reasons (vector, 
integer or floating point??), but there wasn't a clear fix that wouldn't have 
broader impact, so I avoided rocking the boat here.  I think it would be great 
to have a cleaner diagnostic and a better function structure for this code, but 
that probably makes more sense as a separate cleanup step at this point.

Honestly, this code makes me wish there was a more regular structure to 
defining and applying needed constraints here.  It currently feels a bit 
convoluted and messy.

What do you think?

https://github.com/llvm/llvm-project/pull/110198
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [HLSL] Implement `WaveReadLaneAt` intrinsic (PR #111010)

2024-10-07 Thread Tex Riddell via cfe-commits


@@ -2015,6 +2015,13 @@ _HLSL_AVAILABILITY(shadermodel, 6.0)
 _HLSL_BUILTIN_ALIAS(__builtin_hlsl_wave_is_first_lane)
 __attribute__((convergent)) bool WaveIsFirstLane();
 
+// \brief Returns the value of the expression for the given lane index within
+// the specified wave.
+template 
+_HLSL_AVAILABILITY(shadermodel, 6.0)
+_HLSL_BUILTIN_ALIAS(__builtin_hlsl_wave_read_lane_at)
+__attribute__((convergent)) T WaveReadLaneAt(T, int32_t);

tex3d wrote:

`any<>` means scalar, vector, or matrix, with any component type valid with 
those (bool and basic numeric types).

https://github.com/llvm/llvm-project/pull/111010
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Adding splitdouble HLSL function (PR #109331)

2024-10-18 Thread Tex Riddell via cfe-commits


@@ -18952,6 +18955,142 @@ case Builtin::BI__builtin_hlsl_elementwise_isinf: {
 CGM.getHLSLRuntime().getRadiansIntrinsic(), ArrayRef{Op0},
 nullptr, "hlsl.radians");
   }
+  case Builtin::BI__builtin_hlsl_splitdouble: {

tex3d wrote:

This is quite a lot of code in this case of the switch statement, perhaps we 
could break this out into a static function?

https://github.com/llvm/llvm-project/pull/109331
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Adding splitdouble HLSL function (PR #109331)

2024-10-18 Thread Tex Riddell via cfe-commits


@@ -18952,6 +18955,142 @@ case Builtin::BI__builtin_hlsl_elementwise_isinf: {
 CGM.getHLSLRuntime().getRadiansIntrinsic(), ArrayRef{Op0},
 nullptr, "hlsl.radians");
   }
+  case Builtin::BI__builtin_hlsl_splitdouble: {
+
+assert((E->getArg(0)->getType()->hasFloatingRepresentation() &&
+E->getArg(1)->getType()->hasUnsignedIntegerRepresentation() &&
+E->getArg(2)->getType()->hasUnsignedIntegerRepresentation()) &&
+   "asuint operands types mismatch");
+Value *Op0 = EmitScalarExpr(E->getArg(0));
+const auto *OutArg1 = dyn_cast(E->getArg(1));
+const auto *OutArg2 = dyn_cast(E->getArg(2));
+
+CallArgList Args;
+LValue Op1TmpLValue = EmitHLSLOutArgExpr(OutArg1, Args, 
OutArg1->getType());
+LValue Op2TmpLValue = EmitHLSLOutArgExpr(OutArg2, Args, 
OutArg2->getType());
+
+if (getTarget().getCXXABI().areArgsDestroyedLeftToRightInCallee())
+  Args.reverseWritebacks();
+
+auto EmitVectorCode =
+[](Value *Op, CGBuilderTy *Builder,
+   FixedVectorType *DestTy) -> std::pair {
+  Value *bitcast = Builder->CreateBitCast(Op, DestTy);
+
+  SmallVector LowbitsIndex;
+  SmallVector HighbitsIndex;
+
+  for (unsigned int Idx = 0; Idx < DestTy->getNumElements(); Idx += 2) {
+LowbitsIndex.push_back(Idx);
+HighbitsIndex.push_back(Idx + 1);
+  }
+
+  Value *Arg0 = Builder->CreateShuffleVector(bitcast, LowbitsIndex);
+  Value *Arg1 = Builder->CreateShuffleVector(bitcast, HighbitsIndex);
+
+  return std::make_pair(Arg0, Arg1);
+};
+
+Value *LastInst = nullptr;
+
+if (CGM.getTarget().getTriple().isDXIL()) {
+
+  llvm::Type *RetElementTy = Int32Ty;
+  if (auto *Op0VecTy = E->getArg(0)->getType()->getAs())
+RetElementTy = llvm::VectorType::get(
+Int32Ty, ElementCount::getFixed(Op0VecTy->getNumElements()));
+  auto *RetTy = llvm::StructType::get(RetElementTy, RetElementTy);
+
+  CallInst *CI = Builder.CreateIntrinsic(
+  RetTy, Intrinsic::dx_splitdouble, {Op0}, nullptr, 
"hlsl.splitdouble");
+
+  Value *Arg0 = Builder.CreateExtractValue(CI, 0);
+  Value *Arg1 = Builder.CreateExtractValue(CI, 1);
+
+  Builder.CreateStore(Arg0, Op1TmpLValue.getAddress());
+  LastInst = Builder.CreateStore(Arg1, Op2TmpLValue.getAddress());
+
+} else {
+
+  assert(!CGM.getTarget().getTriple().isDXIL() &&
+ "For non-DXIL targets we generate the instructions");
+
+  if (!Op0->getType()->isVectorTy()) {
+FixedVectorType *DestTy = FixedVectorType::get(Int32Ty, 2);
+Value *Bitcast = Builder.CreateBitCast(Op0, DestTy);
+
+Value *Arg0 = Builder.CreateExtractElement(Bitcast, 0.0);
+Value *Arg1 = Builder.CreateExtractElement(Bitcast, 1.0);
+
+Builder.CreateStore(Arg0, Op1TmpLValue.getAddress());
+LastInst = Builder.CreateStore(Arg1, Op2TmpLValue.getAddress());
+  } else {
+
+const auto *TargTy = E->getArg(0)->getType()->getAs();
+
+int NumElements = TargTy->getNumElements();
+
+FixedVectorType *DestTy = FixedVectorType::get(Int32Ty, 4);
+if (NumElements == 1) {
+  FixedVectorType *DestTy = FixedVectorType::get(Int32Ty, 2);
+  Value *Bitcast = Builder.CreateBitCast(Op0, DestTy);
+
+  Value *Arg0 = Builder.CreateExtractElement(Bitcast, 0.0);
+  Value *Arg1 = Builder.CreateExtractElement(Bitcast, 1.0);
+
+  Builder.CreateStore(Arg0, Op1TmpLValue.getAddress());
+  LastInst = Builder.CreateStore(Arg1, Op2TmpLValue.getAddress());
+} else if (NumElements == 2) {
+  auto [LowBits, HighBits] = EmitVectorCode(Op0, &Builder, DestTy);
+
+  Builder.CreateStore(LowBits, Op1TmpLValue.getAddress());
+  LastInst = Builder.CreateStore(HighBits, Op2TmpLValue.getAddress());
+} else {
+
+  SmallVector> EmitedValuePairs;
+
+  for (int It = 0; It < NumElements; It += 2) {
+// Due to existing restrictions to SPIR-V and splitdouble,
+// all shufflevector operations, should return vectors of
+// the same size, up to 4. Such introduce and edge case

tex3d wrote:

I don't understand why SPIR-V vector width/shuffle restrictions are being 
applied during builtin codegen.  Shouldn't constraints be applied elsewhere 
when necessary?  I would have expected the SPIR-V path to be much simpler here.

Most elementwise HLSL intrinsics must support matrices, which should map to 
large vectors.  Shuffles would be used in various cases on those as well.  I 
would think we need an approach that can handle arbitrary, legal llvm 
vector/shuffle code then transform and constrain these later for SPIR-V 
lowering.

Also `!DXIL` could mean more than just SPIR-V at some point, right?

https://github.com/llvm/llvm-project/pull/109331
___
cfe-commits mailing list
cfe-commits@lis

[clang] [llvm] Adding splitdouble HLSL function (PR #109331)

2024-10-18 Thread Tex Riddell via cfe-commits


@@ -4871,6 +4871,12 @@ def HLSLRadians : LangBuiltin<"HLSL_LANG"> {
   let Prototype = "void(...)";
 }
 
+def HLSLSplitDouble: LangBuiltin<"HLSL_LANG"> {
+  let Spellings = ["__builtin_hlsl_splitdouble"];

tex3d wrote:

Are we inconsistent on the use of "elementwise"?  Shouldn't we have that here?

https://github.com/llvm/llvm-project/pull/109331
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Adding splitdouble HLSL function (PR #109331)

2024-10-18 Thread Tex Riddell via cfe-commits


@@ -4681,6 +4601,87 @@ void CallArg::copyInto(CodeGenFunction &CGF, Address 
Addr) const {
   IsUsed = true;
 }
 
+/// Emit the actual writing-back of a writeback.
+void CodeGenFunction::EmitWriteback(CodeGenFunction &CGF,

tex3d wrote:

Moving this function makes it hard to tell if it's been modified at the same 
time.

https://github.com/llvm/llvm-project/pull/109331
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Adding splitdouble HLSL function (PR #109331)

2024-10-18 Thread Tex Riddell via cfe-commits


@@ -5149,6 +5152,12 @@ class CodeGenFunction : public CodeGenTypeCache {
SourceLocation ArgLoc, AbstractCallee AC,
unsigned ParmNum);
 
+  /// EmitWriteback - Emit callbacks for function.
+  void EmitWritebacks(CodeGenFunction &CGF, const CallArgList &Args);
+
+  void EmitWriteback(CodeGenFunction &CGF,
+ const CallArgList::Writeback &writeback);

tex3d wrote:

Was it necessary to expose `EmitWriteback`?  It seems you would only need 
`EmitWritebacks`.

https://github.com/llvm/llvm-project/pull/109331
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Adding splitdouble HLSL function (PR #109331)

2024-10-18 Thread Tex Riddell via cfe-commits


@@ -2074,6 +2083,35 @@ bool SemaHLSL::CheckBuiltinFunctionCall(unsigned 
BuiltinID, CallExpr *TheCall) {
   return true;
 break;
   }
+  case Builtin::BI__builtin_hlsl_splitdouble: {
+if (SemaRef.checkArgCount(TheCall, 3))
+  return true;
+
+Expr *Op0 = TheCall->getArg(0);
+
+auto CheckIsNotDouble = [](clang::QualType PassedType) -> bool {

tex3d wrote:

Naming convention for `CheckIsNotDouble` seems reversed from what I'd expect 
based on some of the established patterns, since double is the desired type.  
See similar functions `CheckAllArgsHaveFloatRepresentation`, 
`CheckFloatOrHalfRepresentations`, `CheckNoDoubleVectors`, 
`CheckFloatingOrIntRepresentation`, etc...

Note: same comment for function `CheckIsNotUint` below.

https://github.com/llvm/llvm-project/pull/109331
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Adding splitdouble HLSL function (PR #109331)

2024-10-18 Thread Tex Riddell via cfe-commits


@@ -18952,6 +18955,142 @@ case Builtin::BI__builtin_hlsl_elementwise_isinf: {
 CGM.getHLSLRuntime().getRadiansIntrinsic(), ArrayRef{Op0},
 nullptr, "hlsl.radians");
   }
+  case Builtin::BI__builtin_hlsl_splitdouble: {
+
+assert((E->getArg(0)->getType()->hasFloatingRepresentation() &&
+E->getArg(1)->getType()->hasUnsignedIntegerRepresentation() &&
+E->getArg(2)->getType()->hasUnsignedIntegerRepresentation()) &&
+   "asuint operands types mismatch");
+Value *Op0 = EmitScalarExpr(E->getArg(0));
+const auto *OutArg1 = dyn_cast(E->getArg(1));
+const auto *OutArg2 = dyn_cast(E->getArg(2));
+
+CallArgList Args;
+LValue Op1TmpLValue = EmitHLSLOutArgExpr(OutArg1, Args, 
OutArg1->getType());
+LValue Op2TmpLValue = EmitHLSLOutArgExpr(OutArg2, Args, 
OutArg2->getType());
+
+if (getTarget().getCXXABI().areArgsDestroyedLeftToRightInCallee())
+  Args.reverseWritebacks();
+
+auto EmitVectorCode =
+[](Value *Op, CGBuilderTy *Builder,
+   FixedVectorType *DestTy) -> std::pair {
+  Value *bitcast = Builder->CreateBitCast(Op, DestTy);
+
+  SmallVector LowbitsIndex;
+  SmallVector HighbitsIndex;
+
+  for (unsigned int Idx = 0; Idx < DestTy->getNumElements(); Idx += 2) {
+LowbitsIndex.push_back(Idx);
+HighbitsIndex.push_back(Idx + 1);
+  }
+
+  Value *Arg0 = Builder->CreateShuffleVector(bitcast, LowbitsIndex);
+  Value *Arg1 = Builder->CreateShuffleVector(bitcast, HighbitsIndex);
+
+  return std::make_pair(Arg0, Arg1);
+};
+
+Value *LastInst = nullptr;
+
+if (CGM.getTarget().getTriple().isDXIL()) {
+
+  llvm::Type *RetElementTy = Int32Ty;
+  if (auto *Op0VecTy = E->getArg(0)->getType()->getAs())
+RetElementTy = llvm::VectorType::get(
+Int32Ty, ElementCount::getFixed(Op0VecTy->getNumElements()));
+  auto *RetTy = llvm::StructType::get(RetElementTy, RetElementTy);
+
+  CallInst *CI = Builder.CreateIntrinsic(
+  RetTy, Intrinsic::dx_splitdouble, {Op0}, nullptr, 
"hlsl.splitdouble");
+
+  Value *Arg0 = Builder.CreateExtractValue(CI, 0);
+  Value *Arg1 = Builder.CreateExtractValue(CI, 1);
+
+  Builder.CreateStore(Arg0, Op1TmpLValue.getAddress());
+  LastInst = Builder.CreateStore(Arg1, Op2TmpLValue.getAddress());
+
+} else {
+
+  assert(!CGM.getTarget().getTriple().isDXIL() &&
+ "For non-DXIL targets we generate the instructions");

tex3d wrote:

No need to assert the negation of the if condition in the else block, right?

https://github.com/llvm/llvm-project/pull/109331
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Adding splitdouble HLSL function (PR #109331)

2024-10-18 Thread Tex Riddell via cfe-commits


@@ -438,6 +438,24 @@ template  constexpr uint asuint(T F) {
   return __detail::bit_cast(F);
 }
 
+//===--===//
+// asuint splitdouble builtins
+//===--===//
+
+/// \fn void asuint(double D, out uint lowbits, out int highbits)
+/// \brief Split and interprets the lowbits and highbits of double D into 
uints.
+/// \param D The input double.
+/// \param lowbits The output lowbits of D.
+/// \param highbits The highbits lowbits D.

tex3d wrote:

```suggestion
/// \param highbits The output highbits D.
```

https://github.com/llvm/llvm-project/pull/109331
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Adding splitdouble HLSL function (PR #109331)

2024-10-18 Thread Tex Riddell via cfe-commits


@@ -0,0 +1,50 @@
+; RUN: llc -verify-machineinstrs -O0 -mtriple=spirv-unknown-unknown %s -o - | 
FileCheck %s
+; RUN: %if spirv-tools %{ llc -O0 -mtriple=spirv-unknown-unknown %s -o - 
-filetype=obj | spirv-val %}
+
+; Make sure lowering is correctly generating spirv code.

tex3d wrote:

This just tests SPIR-V lowering of the code you expect to generate from the 
HLSL.  It doesn't actually test that the `CGBuiltin.cpp` code generates this IR 
for the SPIR-V path.  I think we are missing a test for that.

https://github.com/llvm/llvm-project/pull/109331
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Adding splitdouble HLSL function (PR #109331)

2024-10-18 Thread Tex Riddell via cfe-commits


@@ -18952,6 +18955,142 @@ case Builtin::BI__builtin_hlsl_elementwise_isinf: {
 CGM.getHLSLRuntime().getRadiansIntrinsic(), ArrayRef{Op0},
 nullptr, "hlsl.radians");
   }
+  case Builtin::BI__builtin_hlsl_splitdouble: {
+
+assert((E->getArg(0)->getType()->hasFloatingRepresentation() &&
+E->getArg(1)->getType()->hasUnsignedIntegerRepresentation() &&
+E->getArg(2)->getType()->hasUnsignedIntegerRepresentation()) &&
+   "asuint operands types mismatch");
+Value *Op0 = EmitScalarExpr(E->getArg(0));
+const auto *OutArg1 = dyn_cast(E->getArg(1));
+const auto *OutArg2 = dyn_cast(E->getArg(2));
+
+CallArgList Args;
+LValue Op1TmpLValue = EmitHLSLOutArgExpr(OutArg1, Args, 
OutArg1->getType());
+LValue Op2TmpLValue = EmitHLSLOutArgExpr(OutArg2, Args, 
OutArg2->getType());
+
+if (getTarget().getCXXABI().areArgsDestroyedLeftToRightInCallee())
+  Args.reverseWritebacks();
+
+auto EmitVectorCode =
+[](Value *Op, CGBuilderTy *Builder,
+   FixedVectorType *DestTy) -> std::pair {
+  Value *bitcast = Builder->CreateBitCast(Op, DestTy);
+
+  SmallVector LowbitsIndex;
+  SmallVector HighbitsIndex;
+
+  for (unsigned int Idx = 0; Idx < DestTy->getNumElements(); Idx += 2) {
+LowbitsIndex.push_back(Idx);
+HighbitsIndex.push_back(Idx + 1);
+  }
+
+  Value *Arg0 = Builder->CreateShuffleVector(bitcast, LowbitsIndex);
+  Value *Arg1 = Builder->CreateShuffleVector(bitcast, HighbitsIndex);
+
+  return std::make_pair(Arg0, Arg1);
+};
+
+Value *LastInst = nullptr;
+
+if (CGM.getTarget().getTriple().isDXIL()) {
+
+  llvm::Type *RetElementTy = Int32Ty;
+  if (auto *Op0VecTy = E->getArg(0)->getType()->getAs())
+RetElementTy = llvm::VectorType::get(
+Int32Ty, ElementCount::getFixed(Op0VecTy->getNumElements()));
+  auto *RetTy = llvm::StructType::get(RetElementTy, RetElementTy);
+
+  CallInst *CI = Builder.CreateIntrinsic(
+  RetTy, Intrinsic::dx_splitdouble, {Op0}, nullptr, 
"hlsl.splitdouble");
+
+  Value *Arg0 = Builder.CreateExtractValue(CI, 0);
+  Value *Arg1 = Builder.CreateExtractValue(CI, 1);
+
+  Builder.CreateStore(Arg0, Op1TmpLValue.getAddress());
+  LastInst = Builder.CreateStore(Arg1, Op2TmpLValue.getAddress());
+
+} else {
+
+  assert(!CGM.getTarget().getTriple().isDXIL() &&
+ "For non-DXIL targets we generate the instructions");
+
+  if (!Op0->getType()->isVectorTy()) {
+FixedVectorType *DestTy = FixedVectorType::get(Int32Ty, 2);
+Value *Bitcast = Builder.CreateBitCast(Op0, DestTy);
+
+Value *Arg0 = Builder.CreateExtractElement(Bitcast, 0.0);
+Value *Arg1 = Builder.CreateExtractElement(Bitcast, 1.0);
+
+Builder.CreateStore(Arg0, Op1TmpLValue.getAddress());
+LastInst = Builder.CreateStore(Arg1, Op2TmpLValue.getAddress());
+  } else {
+
+const auto *TargTy = E->getArg(0)->getType()->getAs();
+
+int NumElements = TargTy->getNumElements();
+
+FixedVectorType *DestTy = FixedVectorType::get(Int32Ty, 4);
+if (NumElements == 1) {
+  FixedVectorType *DestTy = FixedVectorType::get(Int32Ty, 2);
+  Value *Bitcast = Builder.CreateBitCast(Op0, DestTy);
+
+  Value *Arg0 = Builder.CreateExtractElement(Bitcast, 0.0);
+  Value *Arg1 = Builder.CreateExtractElement(Bitcast, 1.0);
+
+  Builder.CreateStore(Arg0, Op1TmpLValue.getAddress());
+  LastInst = Builder.CreateStore(Arg1, Op2TmpLValue.getAddress());
+} else if (NumElements == 2) {
+  auto [LowBits, HighBits] = EmitVectorCode(Op0, &Builder, DestTy);
+
+  Builder.CreateStore(LowBits, Op1TmpLValue.getAddress());
+  LastInst = Builder.CreateStore(HighBits, Op2TmpLValue.getAddress());
+} else {
+
+  SmallVector> EmitedValuePairs;
+
+  for (int It = 0; It < NumElements; It += 2) {
+// Due to existing restrictions to SPIR-V and splitdouble,
+// all shufflevector operations, should return vectors of
+// the same size, up to 4. Such introduce and edge case

tex3d wrote:

I think this expansion (if we still have to do it here) would be better done a 
different way.

Instead of adding a dummy value to the original shuffle for the cast, add a 
shuffle to extend the casted result vector when needed, adding poison values 
instead of "dummy" values there.  This keeps the extra values localized, 
poison, and more easily eliminated, instead of passing through the bitcast.  
This can be done as part of generating the final high/low shuffles when the 
vector sizes don't match at the end.  That also keeps the logic localized and 
makes it more obvious why it's needed.

https://github.com/llvm/llvm-project/pull/109331
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
h

[clang] [llvm] Adding splitdouble HLSL function (PR #109331)

2024-10-18 Thread Tex Riddell via cfe-commits


@@ -0,0 +1,50 @@
+; RUN: llc -verify-machineinstrs -O0 -mtriple=spirv-unknown-unknown %s -o - | 
FileCheck %s
+; RUN: %if spirv-tools %{ llc -O0 -mtriple=spirv-unknown-unknown %s -o - 
-filetype=obj | spirv-val %}
+
+; Make sure lowering is correctly generating spirv code.

tex3d wrote:

Oh, nevermind, the test was under `CodeGenHLSL/builtins`.

https://github.com/llvm/llvm-project/pull/109331
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Adding splitdouble HLSL function (PR #109331)

2024-10-18 Thread Tex Riddell via cfe-commits


@@ -1698,18 +1698,27 @@ static bool CheckVectorElementCallArgs(Sema *S, 
CallExpr *TheCall) {
   return true;
 }
 
-static bool CheckArgsTypesAreCorrect(
+bool CheckArgTypeIsIncorrect(

tex3d wrote:

This switches the naming pattern from the existing convention.  Why?  It makes 
me think: "Make sure the arg type is incorrect".  I know that the other 
functions return true on failure and false on success, but that's just the 
standard convention for sema checking functions.

https://github.com/llvm/llvm-project/pull/109331
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Add __builtin_(elementwise|reduce)_(max|min)imum (PR #110198)

2024-10-01 Thread Tex Riddell via cfe-commits

tex3d wrote:

@francisvm 
This changed diagnostics, but didn't update all the affected tests.

See: https://github.com/llvm/llvm-project/pull/110187#issuecomment-2387260975

https://github.com/llvm/llvm-project/pull/110198
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Adding splitdouble HLSL function (PR #109331)

2024-10-22 Thread Tex Riddell via cfe-commits


@@ -95,6 +99,157 @@ static void initializeAlloca(CodeGenFunction &CGF, 
AllocaInst *AI, Value *Size,
   I->addAnnotationMetadata("auto-init");
 }
 
+static Value *handleHlslSplitdouble(const CallExpr *E, CodeGenFunction *CGF) {
+  Value *Op0 = CGF->EmitScalarExpr(E->getArg(0));
+  const auto *OutArg1 = dyn_cast(E->getArg(1));
+  const auto *OutArg2 = dyn_cast(E->getArg(2));
+
+  CallArgList Args;
+  LValue Op1TmpLValue =
+  CGF->EmitHLSLOutArgExpr(OutArg1, Args, OutArg1->getType());
+  LValue Op2TmpLValue =
+  CGF->EmitHLSLOutArgExpr(OutArg2, Args, OutArg2->getType());
+
+  if (CGF->getTarget().getCXXABI().areArgsDestroyedLeftToRightInCallee())
+Args.reverseWritebacks();
+
+  auto EmitVectorCode =
+  [](Value *Op, CGBuilderTy *Builder,
+ FixedVectorType *DestTy) -> std::pair {
+Value *bitcast = Builder->CreateBitCast(Op, DestTy);
+
+SmallVector LowbitsIndex;
+SmallVector HighbitsIndex;
+
+for (unsigned int Idx = 0; Idx < DestTy->getNumElements(); Idx += 2) {
+  LowbitsIndex.push_back(Idx);
+  HighbitsIndex.push_back(Idx + 1);
+}
+
+Value *Arg0 = Builder->CreateShuffleVector(bitcast, LowbitsIndex);
+Value *Arg1 = Builder->CreateShuffleVector(bitcast, HighbitsIndex);
+
+return std::make_pair(Arg0, Arg1);
+  };
+
+  Value *LastInst = nullptr;
+
+  if (CGF->CGM.getTarget().getTriple().isDXIL()) {
+
+llvm::Type *RetElementTy = CGF->Int32Ty;
+if (auto *Op0VecTy = E->getArg(0)->getType()->getAs())
+  RetElementTy = llvm::VectorType::get(
+  CGF->Int32Ty, ElementCount::getFixed(Op0VecTy->getNumElements()));
+auto *RetTy = llvm::StructType::get(RetElementTy, RetElementTy);
+
+CallInst *CI = CGF->Builder.CreateIntrinsic(
+RetTy, Intrinsic::dx_splitdouble, {Op0}, nullptr, "hlsl.splitdouble");
+
+Value *Arg0 = CGF->Builder.CreateExtractValue(CI, 0);
+Value *Arg1 = CGF->Builder.CreateExtractValue(CI, 1);
+
+CGF->Builder.CreateStore(Arg0, Op1TmpLValue.getAddress());
+LastInst = CGF->Builder.CreateStore(Arg1, Op2TmpLValue.getAddress());

tex3d wrote:

If `Arg0` and `Arg1` were declared in outer scope, this store code would be 
common to all branches, right?  Also, I think the names of `Arg0` and `Arg1` 
could better indicate the values, like: `LowBits` and `HighBits`.

https://github.com/llvm/llvm-project/pull/109331
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Adding splitdouble HLSL function (PR #109331)

2024-10-22 Thread Tex Riddell via cfe-commits


@@ -4681,6 +4676,12 @@ void CallArg::copyInto(CodeGenFunction &CGF, Address 
Addr) const {
   IsUsed = true;
 }
 
+void CodeGenFunction::EmitWritebacks(CodeGenFunction &CGF,
+ const CallArgList &args) {
+  for (const auto &I : args.writebacks())
+emitWriteback(CGF, I);
+}

tex3d wrote:

Since this requires a reference to a CodeGenFunction, why not just make this a 
regular method instead of a static method?

In fact, after double-checking, it appears that it is a regular method, so why 
does it need a separate input argument when it already has `this`?

```suggestion
void CodeGenFunction::EmitWritebacks(const CallArgList &args) {
  for (const auto &I : args.writebacks())
emitWriteback(*this, I);
}
```

https://github.com/llvm/llvm-project/pull/109331
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Adding splitdouble HLSL function (PR #109331)

2024-10-22 Thread Tex Riddell via cfe-commits


@@ -5149,6 +5152,9 @@ class CodeGenFunction : public CodeGenTypeCache {
SourceLocation ArgLoc, AbstractCallee AC,
unsigned ParmNum);
 
+  /// EmitWriteback - Emit callbacks for function.
+  void EmitWritebacks(CodeGenFunction &CGF, const CallArgList &Args);

tex3d wrote:

This isn't a static method, but it's signature and use make it look like you 
meant for it to be.  It seems like it could just be a regular method though.

```suggestion
  void EmitWritebacks(const CallArgList &Args);
```

https://github.com/llvm/llvm-project/pull/109331
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Adding splitdouble HLSL function (PR #109331)

2024-10-22 Thread Tex Riddell via cfe-commits


@@ -18952,6 +18955,142 @@ case Builtin::BI__builtin_hlsl_elementwise_isinf: {
 CGM.getHLSLRuntime().getRadiansIntrinsic(), ArrayRef{Op0},
 nullptr, "hlsl.radians");
   }
+  case Builtin::BI__builtin_hlsl_splitdouble: {
+
+assert((E->getArg(0)->getType()->hasFloatingRepresentation() &&
+E->getArg(1)->getType()->hasUnsignedIntegerRepresentation() &&
+E->getArg(2)->getType()->hasUnsignedIntegerRepresentation()) &&
+   "asuint operands types mismatch");
+Value *Op0 = EmitScalarExpr(E->getArg(0));
+const auto *OutArg1 = dyn_cast(E->getArg(1));
+const auto *OutArg2 = dyn_cast(E->getArg(2));
+
+CallArgList Args;
+LValue Op1TmpLValue = EmitHLSLOutArgExpr(OutArg1, Args, 
OutArg1->getType());
+LValue Op2TmpLValue = EmitHLSLOutArgExpr(OutArg2, Args, 
OutArg2->getType());
+
+if (getTarget().getCXXABI().areArgsDestroyedLeftToRightInCallee())
+  Args.reverseWritebacks();
+
+auto EmitVectorCode =
+[](Value *Op, CGBuilderTy *Builder,
+   FixedVectorType *DestTy) -> std::pair {
+  Value *bitcast = Builder->CreateBitCast(Op, DestTy);
+
+  SmallVector LowbitsIndex;
+  SmallVector HighbitsIndex;
+
+  for (unsigned int Idx = 0; Idx < DestTy->getNumElements(); Idx += 2) {
+LowbitsIndex.push_back(Idx);
+HighbitsIndex.push_back(Idx + 1);
+  }
+
+  Value *Arg0 = Builder->CreateShuffleVector(bitcast, LowbitsIndex);
+  Value *Arg1 = Builder->CreateShuffleVector(bitcast, HighbitsIndex);
+
+  return std::make_pair(Arg0, Arg1);
+};
+
+Value *LastInst = nullptr;
+
+if (CGM.getTarget().getTriple().isDXIL()) {
+
+  llvm::Type *RetElementTy = Int32Ty;
+  if (auto *Op0VecTy = E->getArg(0)->getType()->getAs())
+RetElementTy = llvm::VectorType::get(
+Int32Ty, ElementCount::getFixed(Op0VecTy->getNumElements()));
+  auto *RetTy = llvm::StructType::get(RetElementTy, RetElementTy);
+
+  CallInst *CI = Builder.CreateIntrinsic(
+  RetTy, Intrinsic::dx_splitdouble, {Op0}, nullptr, 
"hlsl.splitdouble");
+
+  Value *Arg0 = Builder.CreateExtractValue(CI, 0);
+  Value *Arg1 = Builder.CreateExtractValue(CI, 1);
+
+  Builder.CreateStore(Arg0, Op1TmpLValue.getAddress());
+  LastInst = Builder.CreateStore(Arg1, Op2TmpLValue.getAddress());
+
+} else {
+
+  assert(!CGM.getTarget().getTriple().isDXIL() &&
+ "For non-DXIL targets we generate the instructions");

tex3d wrote:

I don't know what the code looked like when that comment was made, but it seems 
pointless to assert something that can't be false based on the local context.  
As in, this assert doesn't seem necessary or useful:  `if (a) {} else { 
assert(!a); }`  That assert is impossible to violate, unless you were to change 
the if condition.

https://github.com/llvm/llvm-project/pull/109331
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Adding splitdouble HLSL function (PR #109331)

2024-10-22 Thread Tex Riddell via cfe-commits


@@ -18952,6 +18955,142 @@ case Builtin::BI__builtin_hlsl_elementwise_isinf: {
 CGM.getHLSLRuntime().getRadiansIntrinsic(), ArrayRef{Op0},
 nullptr, "hlsl.radians");
   }
+  case Builtin::BI__builtin_hlsl_splitdouble: {
+
+assert((E->getArg(0)->getType()->hasFloatingRepresentation() &&
+E->getArg(1)->getType()->hasUnsignedIntegerRepresentation() &&
+E->getArg(2)->getType()->hasUnsignedIntegerRepresentation()) &&
+   "asuint operands types mismatch");
+Value *Op0 = EmitScalarExpr(E->getArg(0));
+const auto *OutArg1 = dyn_cast(E->getArg(1));
+const auto *OutArg2 = dyn_cast(E->getArg(2));
+
+CallArgList Args;
+LValue Op1TmpLValue = EmitHLSLOutArgExpr(OutArg1, Args, 
OutArg1->getType());
+LValue Op2TmpLValue = EmitHLSLOutArgExpr(OutArg2, Args, 
OutArg2->getType());
+
+if (getTarget().getCXXABI().areArgsDestroyedLeftToRightInCallee())
+  Args.reverseWritebacks();
+
+auto EmitVectorCode =
+[](Value *Op, CGBuilderTy *Builder,
+   FixedVectorType *DestTy) -> std::pair {
+  Value *bitcast = Builder->CreateBitCast(Op, DestTy);
+
+  SmallVector LowbitsIndex;
+  SmallVector HighbitsIndex;
+
+  for (unsigned int Idx = 0; Idx < DestTy->getNumElements(); Idx += 2) {
+LowbitsIndex.push_back(Idx);
+HighbitsIndex.push_back(Idx + 1);
+  }
+
+  Value *Arg0 = Builder->CreateShuffleVector(bitcast, LowbitsIndex);
+  Value *Arg1 = Builder->CreateShuffleVector(bitcast, HighbitsIndex);
+
+  return std::make_pair(Arg0, Arg1);
+};
+
+Value *LastInst = nullptr;
+
+if (CGM.getTarget().getTriple().isDXIL()) {
+
+  llvm::Type *RetElementTy = Int32Ty;
+  if (auto *Op0VecTy = E->getArg(0)->getType()->getAs())
+RetElementTy = llvm::VectorType::get(
+Int32Ty, ElementCount::getFixed(Op0VecTy->getNumElements()));
+  auto *RetTy = llvm::StructType::get(RetElementTy, RetElementTy);
+
+  CallInst *CI = Builder.CreateIntrinsic(
+  RetTy, Intrinsic::dx_splitdouble, {Op0}, nullptr, 
"hlsl.splitdouble");
+
+  Value *Arg0 = Builder.CreateExtractValue(CI, 0);
+  Value *Arg1 = Builder.CreateExtractValue(CI, 1);
+
+  Builder.CreateStore(Arg0, Op1TmpLValue.getAddress());
+  LastInst = Builder.CreateStore(Arg1, Op2TmpLValue.getAddress());
+
+} else {
+
+  assert(!CGM.getTarget().getTriple().isDXIL() &&
+ "For non-DXIL targets we generate the instructions");
+
+  if (!Op0->getType()->isVectorTy()) {
+FixedVectorType *DestTy = FixedVectorType::get(Int32Ty, 2);
+Value *Bitcast = Builder.CreateBitCast(Op0, DestTy);
+
+Value *Arg0 = Builder.CreateExtractElement(Bitcast, 0.0);
+Value *Arg1 = Builder.CreateExtractElement(Bitcast, 1.0);
+
+Builder.CreateStore(Arg0, Op1TmpLValue.getAddress());
+LastInst = Builder.CreateStore(Arg1, Op2TmpLValue.getAddress());
+  } else {
+
+const auto *TargTy = E->getArg(0)->getType()->getAs();
+
+int NumElements = TargTy->getNumElements();
+
+FixedVectorType *DestTy = FixedVectorType::get(Int32Ty, 4);
+if (NumElements == 1) {
+  FixedVectorType *DestTy = FixedVectorType::get(Int32Ty, 2);
+  Value *Bitcast = Builder.CreateBitCast(Op0, DestTy);
+
+  Value *Arg0 = Builder.CreateExtractElement(Bitcast, 0.0);
+  Value *Arg1 = Builder.CreateExtractElement(Bitcast, 1.0);
+
+  Builder.CreateStore(Arg0, Op1TmpLValue.getAddress());
+  LastInst = Builder.CreateStore(Arg1, Op2TmpLValue.getAddress());
+} else if (NumElements == 2) {
+  auto [LowBits, HighBits] = EmitVectorCode(Op0, &Builder, DestTy);
+
+  Builder.CreateStore(LowBits, Op1TmpLValue.getAddress());
+  LastInst = Builder.CreateStore(HighBits, Op2TmpLValue.getAddress());
+} else {
+
+  SmallVector> EmitedValuePairs;
+
+  for (int It = 0; It < NumElements; It += 2) {
+// Due to existing restrictions to SPIR-V and splitdouble,
+// all shufflevector operations, should return vectors of
+// the same size, up to 4. Such introduce and edge case

tex3d wrote:

I still don't know where we stand on doing this kind of work in `CGBuilin.cpp` 
vs. splitting vectors for SPIR-V constraints in a more general way later.  I 
don't know why we are applying spirv constraints in builtin expansion.  We 
don't scalarize operations for DXIL in our builtin expansion code.
For one thing, this code path isn't technically limited to SPIR-V, so applying 
SPIR-V constraints here seems premature.

https://github.com/llvm/llvm-project/pull/109331
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Adding splitdouble HLSL function (PR #109331)

2024-10-22 Thread Tex Riddell via cfe-commits


@@ -2074,6 +2083,35 @@ bool SemaHLSL::CheckBuiltinFunctionCall(unsigned 
BuiltinID, CallExpr *TheCall) {
   return true;
 break;
   }
+  case Builtin::BI__builtin_hlsl_elementwise_splitdouble: {
+if (SemaRef.checkArgCount(TheCall, 3))
+  return true;
+
+Expr *Op0 = TheCall->getArg(0);
+
+auto CheckIsDouble = [](clang::QualType PassedType) -> bool {
+  return !PassedType->hasFloatingRepresentation();
+};
+
+if (CheckArgTypeIsCorrect(&SemaRef, Op0, SemaRef.Context.DoubleTy,
+  CheckIsDouble))
+  return true;
+
+Expr *Op1 = TheCall->getArg(1);
+Expr *Op2 = TheCall->getArg(2);
+
+auto CheckIsUint = [](clang::QualType PassedType) -> bool {
+  return !PassedType->hasUnsignedIntegerRepresentation();
+};
+
+if (CheckArgTypeIsCorrect(&SemaRef, Op1, SemaRef.Context.UnsignedIntTy,
+  CheckIsUint) ||
+CheckArgTypeIsCorrect(&SemaRef, Op2, SemaRef.Context.UnsignedIntTy,
+  CheckIsUint))
+  return true;

tex3d wrote:

Since this should only allow `(double, uint, uint)`, I think this code needs to 
be different, like:
```suggestion
if (CheckScalarOrVector(&SemaRef, TheCall, SemaRef.Context.DoubleTy, 0) ||
CheckScalarOrVector(&SemaRef, TheCall, SemaRef.Context.UnsignedIntTy,
1) ||
CheckScalarOrVector(&SemaRef, TheCall, SemaRef.Context.UnsignedIntTy,
2))
  return true;
```

https://github.com/llvm/llvm-project/pull/109331
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Adding splitdouble HLSL function (PR #109331)

2024-10-22 Thread Tex Riddell via cfe-commits


@@ -0,0 +1,102 @@
+// RUN: %clang_cc1 -finclude-default-header -x hlsl -triple 
dxil-pc-shadermodel6.3-library %s -fnative-half-type -emit-llvm -O1 -o - | 
FileCheck %s
+// RUN: %clang_cc1 -finclude-default-header -x hlsl -triple 
spirv-vulkan-library %s -fnative-half-type -emit-llvm -O0 -o - | FileCheck %s 
--check-prefix=SPIRV
+
+
+
+// CHECK: define {{.*}} i32 {{.*}}test_scalar{{.*}}(double {{.*}} [[VALD:%.*]])
+// CHECK: [[VALRET:%.*]] = {{.*}} call { i32, i32 } 
@llvm.dx.splitdouble.i32(double [[VALD]])
+// CHECK-NEXT: extractvalue { i32, i32 } [[VALRET]], 0
+// CHECK-NEXT: extractvalue { i32, i32 } [[VALRET]], 1
+// SPIRV: define spir_func {{.*}} i32 {{.*}}test_scalar{{.*}}(double {{.*}} 
[[VALD:%.*]])
+// SPIRV-NOT: @llvm.dx.splitdouble.i32
+// SPIRV: [[REG:%.*]] = load double, ptr [[VALD]].addr, align 8
+// SPIRV-NEXT: [[CAST:%.*]] = bitcast double [[REG]] to <2 x i32>
+// SPIRV-NEXT: extractelement <2 x i32> [[CAST]], i64 0
+// SPIRV-NEXT: extractelement <2 x i32> [[CAST]], i64 1
+uint test_scalar(double D) {
+  uint A, B;
+  asuint(D, A, B);
+  return A + B;
+}
+
+// CHECK: define {{.*}} i32 {{.*}}test_double1{{.*}}(<1 x double> {{.*}} 
[[VALD:%.*]])
+// CHECK: [[TRUNC:%.*]] = extractelement <1 x double> %D, i64 0
+// CHECK-NEXT: [[VALRET:%.*]] = {{.*}} call { i32, i32 } 
@llvm.dx.splitdouble.i32(double [[TRUNC]])
+// CHECK-NEXT: extractvalue { i32, i32 } [[VALRET]], 0
+// CHECK-NEXT: extractvalue { i32, i32 } [[VALRET]], 1
+// SPIRV: define spir_func {{.*}} i32 {{.*}}test_double1{{.*}}(<1 x double> 
{{.*}} [[VALD:%.*]])
+// SPIRV-NOT: @llvm.dx.splitdouble.i32
+// SPIRV: [[REG:%.*]] = load <1 x double>, ptr [[VALD]].addr, align 8
+// SPIRV-NEXT: [[TRUNC:%.*]] = extractelement <1 x double> %1, i64 0
+// SPIRV-NEXT: [[CAST:%.*]] = bitcast double [[TRUNC]] to <2 x i32>
+// SPIRV-NEXT: extractelement <2 x i32> [[CAST]], i64 0
+// SPIRV-NEXT: extractelement <2 x i32> [[CAST]], i64 1
+uint test_double1(double1 D) {
+  uint A, B;
+  asuint(D, A, B);
+  return A + B;
+}
+
+// CHECK: define {{.*}} <2 x i32> {{.*}}test_vector2{{.*}}(<2 x double> {{.*}} 
[[VALD:%.*]])
+// CHECK: [[VALRET:%.*]] = {{.*}} call { <2 x i32>, <2 x i32> } 
@llvm.dx.splitdouble.v2i32(<2 x double> [[VALD]])
+// CHECK-NEXT: extractvalue { <2 x i32>, <2 x i32> } [[VALRET]], 0
+// CHECK-NEXT: extractvalue { <2 x i32>, <2 x i32> } [[VALRET]], 1
+// SPIRV: define spir_func {{.*}} <2 x i32> {{.*}}test_vector2{{.*}}(<2 x 
double> {{.*}} [[VALD:%.*]])
+// SPIRV-NOT: @llvm.dx.splitdouble.i32
+// SPIRV: [[REG:%.*]] = load <2 x double>, ptr [[VALD]].addr, align 16
+// SPIRV-NEXT: [[CAST1:%.*]] = bitcast <2 x double> [[REG]] to <4 x i32>
+// SPIRV-NEXT: [[SHUF1:%.*]] = shufflevector <4 x i32> [[CAST1]], <4 x i32> 
poison, <2 x i32> 
+// SPIRV-NEXT: [[SHUF2:%.*]] = shufflevector <4 x i32> [[CAST1]], <4 x i32> 
poison, <2 x i32> 
+uint2 test_vector2(double2 D) {
+  uint2 A, B;
+  asuint(D, A, B);
+  return A + B;
+}
+
+// CHECK: define {{.*}} <3 x i32> {{.*}}test_vector3{{.*}}(<3 x double> {{.*}} 
[[VALD:%.*]])
+// CHECK: [[VALRET:%.*]] = {{.*}} call { <3 x i32>, <3 x i32> } 
@llvm.dx.splitdouble.v3i32(<3 x double> [[VALD]])
+// CHECK-NEXT: extractvalue { <3 x i32>, <3 x i32> } [[VALRET]], 0
+// CHECK-NEXT: extractvalue { <3 x i32>, <3 x i32> } [[VALRET]], 1
+// SPIRV: define spir_func {{.*}} <3 x i32> {{.*}}test_vector3{{.*}}(<3 x 
double> {{.*}} [[VALD:%.*]])
+// SPIRV-NOT: @llvm.dx.splitdouble.i32
+// SPIRV: [[REG:%.*]] = load <3 x double>, ptr [[VALD]].addr, align 32
+// SPIRV-NEXT: [[VALRET1:%.*]] = shufflevector <3 x double> [[REG]], <3 x 
double> poison, <2 x i32> 
+// SPIRV-NEXT: [[CAST1:%.*]] = bitcast <2 x double> [[VALRET1]] to <4 x i32>
+// SPIRV-NEXT: [[SHUF1:%.*]] = shufflevector <4 x i32> [[CAST1]], <4 x i32> 
poison, <2 x i32> 
+// SPIRV-NEXT: [[SHUF2:%.*]] = shufflevector <4 x i32> [[CAST1]], <4 x i32> 
poison, <2 x i32> 
+// SPIRV-NEXT: [[VALRET2:%.*]] = shufflevector <3 x double> [[REG]], <3 x 
double> poison, <1 x i32> 
+// SPIRV-NEXT: [[CAST2:%.*]] = bitcast <1 x double> [[VALRET2]] to <2 x i32>
+// SPIRV-NEXT: [[SHUF3:%.*]] = shufflevector <2 x i32> [[CAST2]], <2 x i32> 
poison, <1 x i32> zeroinitializer
+// SPIRV-NEXT: [[SHUF4:%.*]] = shufflevector <2 x i32> [[CAST2]], <2 x i32> 
poison, <1 x i32> 
+// SPIRV-NEXT: [[SHUF5:%.*]] = shufflevector <1 x i32> [[SHUF3]], <1 x i32> 
poison, <2 x i32> zeroinitializer
+// SPIRV-NEXT: [[SHUF6:%.*]] = shufflevector <1 x i32> [[SHUF4]], <1 x i32> 
poison, <2 x i32> zeroinitializer
+// SPIRV-NEXT: shufflevector <2 x i32> %4, <2 x i32> [[SHUF5]], <3 x i32> 
+// SPIRV-NEXT: shufflevector <2 x i32> %5, <2 x i32> [[SHUF6]], <3 x i32> 

tex3d wrote:

This is different than the sequence I expected.

For one thing, vector of size 1 will generate invalid SPIR-V IR, which will 
fail SPIRV validation.  You have to use extractelement for the last element, 
then a scalar double to vector i32 bitcast, then shuffles.

For another, I thought you were going to combine t

[clang] [llvm] Adding splitdouble HLSL function (PR #109331)

2024-10-22 Thread Tex Riddell via cfe-commits


@@ -0,0 +1,50 @@
+; RUN: llc -verify-machineinstrs -O0 -mtriple=spirv-unknown-unknown %s -o - | 
FileCheck %s
+; RUN: %if spirv-tools %{ llc -O0 -mtriple=spirv-unknown-unknown %s -o - 
-filetype=obj | spirv-val %}
+
+; Make sure lowering is correctly generating spirv code.
+
+; CHECK-DAG: %[[#double:]] = OpTypeFloat 64
+; CHECK-DAG: %[[#int_32:]] = OpTypeInt 32 0
+; CHECK-DAG: %[[#scalar_function:]] = OpTypeFunction %[[#int_32]] %[[#double]]
+; CHECK-DAG: %[[#vec_2_int_32:]] = OpTypeVector %[[#int_32]] 2
+; CHECK-DAG: %[[#vec_4_int_32:]] = OpTypeVector %[[#int_32]] 4
+; CHECK-DAG: %[[#vec_3_int_32:]] = OpTypeVector %[[#int_32]] 3
+; CHECK-DAG: %[[#vec_3_double:]] = OpTypeVector %[[#double]] 3
+; CHECK-DAG: %[[#vector_function:]] = OpTypeFunction %[[#vec_3_int_32]] 
%[[#vec_3_double]]
+; CHECK-DAG: %[[#vec_2_double:]] = OpTypeVector %[[#double]] 2
+
+
+define spir_func noundef i32 @test_scalar(double noundef %D) 
local_unnamed_addr {
+entry:
+  ; CHECK: %[[#]] = OpFunction %[[#int_32]] None %[[#scalar_function]]
+  ; CHECK: %[[#param:]] = OpFunctionParameter %[[#double]]
+  ; CHECK: %[[#bitcast:]] = OpBitcast %[[#vec_2_int_32]] %[[#param]]
+  %0 = bitcast double %D to <2 x i32>
+  ; CHECK: %[[#]] = OpCompositeExtract %[[#int_32:]] %[[#bitcast]] 0
+  %1 = extractelement <2 x i32> %0, i64 0
+  ; CHECK: %[[#]] = OpCompositeExtract %[[#int_32:]] %[[#bitcast]] 1
+  %2 = extractelement <2 x i32> %0, i64 1
+  %add = add i32 %1, %2
+  ret i32 %add
+}
+
+
+define spir_func noundef <3 x i32> @test_vector(<3 x double> noundef %D) 
local_unnamed_addr {
+entry:
+  ; CHECK: %[[#]] = OpFunction %[[#vec_3_int_32]] None %[[#vector_function]]
+  ; CHECK: %[[#param:]] = OpFunctionParameter %[[#vec_3_double]]
+  ; CHECK: %[[#shuf1:]] = OpVectorShuffle %[[#vec_2_double]] %[[#param]] 
%[[#]] 0 1
+  %0 = shufflevector <3 x double> %D, <3 x double> poison, <2 x i32> 
+  ; CHECK: %[[#shuf2:]] = OpVectorShuffle %[[#vec_2_double]] %[[#param]] 
%[[#]] 2 0  
+  %1 = shufflevector <3 x double> %D, <3 x double> poison, <2 x i32> 
+  ; CHECK: %[[#cast1:]] = OpBitcast %[[#vec_4_int_32]] %[[#shuf1]]  
+  %2 = bitcast <2 x double> %0 to <4 x i32>
+  ; CHECK: %[[#cast2:]] = OpBitcast %[[#vec_4_int_32]] %[[#shuf2]]  
+  %3 = bitcast <2 x double> %1 to <4 x i32>
+  ; CHECK: %[[#]] = OpVectorShuffle %[[#vec_3_int_32]] %[[#cast1]] %[[#cast2]] 
0 2 4  
+  %4 = shufflevector <4 x i32> %2, <4 x i32> %3, <3 x i32> 
+  ; CHECK: %[[#]] = OpVectorShuffle %[[#vec_3_int_32]] %[[#cast1]] %[[#cast2]] 
1 3 5  
+  %5 = shufflevector <4 x i32> %2, <4 x i32> %3, <3 x i32> 
+  %add = add <3 x i32> %4, %5
+  ret <3 x i32> %add
+}

tex3d wrote:

This test code is not in sync with the current code generated by 
`CGBuiltin.cpp`.  If you try that result, you'll see that `spirv-val` will fail 
due to the size-1 vectors.  I would expect this to look more like:

```suggestion
define spir_func noundef <3 x i32> @test_vector3(<3 x double> noundef %D) 
local_unnamed_addr {
entry:
  ; CHECK-LABEL: ; -- Begin function test_vector3
  ; CHECK: %[[#param:]] = OpFunctionParameter %[[#vec_3_double]]
  ; CHECK: %[[#shuf1:]] = OpVectorShuffle %[[#vec_2_double]] %[[#param]] %[[#]] 
0 1
  %0 = shufflevector <3 x double> %D, <3 x double> poison, <2 x i32> 
  ; CHECK: %[[#extract2:]] = OpCompositeExtract %[[#double]] %[[#param]] 2
  %1 = extractelement <3 x double> %D, i32 2
  ; CHECK: %[[#cast1:]] = OpBitcast %[[#vec_4_int_32]] %[[#shuf1]]
  %2 = bitcast <2 x double> %0 to <4 x i32>
  ; CHECK: %[[#cast2:]] = OpBitcast %[[#vec_2_int_32]] %[[#extract2]]
  %3 = bitcast double %1 to <2 x i32>
  ; CHECK: %[[#shuf3:]] = OpVectorShuffle %[[#vec_4_int_32]] %[[#cast2]] %[[#]] 
0 1 2 3
  %4 = shufflevector <2 x i32> %3, <2 x i32> poison, <4 x i32> 
  ; CHECK: %[[#]] = OpVectorShuffle %[[#vec_3_int_32]] %[[#cast1]] %[[#shuf3]] 
0 2 4
  %high = shufflevector <4 x i32> %2, <4 x i32> %4, <3 x i32> 
  ; CHECK: %[[#]] = OpVectorShuffle %[[#vec_3_int_32]] %[[#cast1]] %[[#shuf3]] 
1 3 5
  %low = shufflevector <4 x i32> %2, <4 x i32> %4, <3 x i32> 
  %add = add <3 x i32> %high, %low
  ret <3 x i32> %add
}
```

https://github.com/llvm/llvm-project/pull/109331
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Adding splitdouble HLSL function (PR #109331)

2024-10-22 Thread Tex Riddell via cfe-commits


@@ -18952,6 +18955,142 @@ case Builtin::BI__builtin_hlsl_elementwise_isinf: {
 CGM.getHLSLRuntime().getRadiansIntrinsic(), ArrayRef{Op0},
 nullptr, "hlsl.radians");
   }
+  case Builtin::BI__builtin_hlsl_splitdouble: {
+
+assert((E->getArg(0)->getType()->hasFloatingRepresentation() &&
+E->getArg(1)->getType()->hasUnsignedIntegerRepresentation() &&
+E->getArg(2)->getType()->hasUnsignedIntegerRepresentation()) &&
+   "asuint operands types mismatch");
+Value *Op0 = EmitScalarExpr(E->getArg(0));
+const auto *OutArg1 = dyn_cast(E->getArg(1));
+const auto *OutArg2 = dyn_cast(E->getArg(2));
+
+CallArgList Args;
+LValue Op1TmpLValue = EmitHLSLOutArgExpr(OutArg1, Args, 
OutArg1->getType());
+LValue Op2TmpLValue = EmitHLSLOutArgExpr(OutArg2, Args, 
OutArg2->getType());
+
+if (getTarget().getCXXABI().areArgsDestroyedLeftToRightInCallee())
+  Args.reverseWritebacks();
+
+auto EmitVectorCode =
+[](Value *Op, CGBuilderTy *Builder,
+   FixedVectorType *DestTy) -> std::pair {
+  Value *bitcast = Builder->CreateBitCast(Op, DestTy);
+
+  SmallVector LowbitsIndex;
+  SmallVector HighbitsIndex;
+
+  for (unsigned int Idx = 0; Idx < DestTy->getNumElements(); Idx += 2) {
+LowbitsIndex.push_back(Idx);
+HighbitsIndex.push_back(Idx + 1);
+  }
+
+  Value *Arg0 = Builder->CreateShuffleVector(bitcast, LowbitsIndex);
+  Value *Arg1 = Builder->CreateShuffleVector(bitcast, HighbitsIndex);
+
+  return std::make_pair(Arg0, Arg1);
+};
+
+Value *LastInst = nullptr;
+
+if (CGM.getTarget().getTriple().isDXIL()) {
+
+  llvm::Type *RetElementTy = Int32Ty;
+  if (auto *Op0VecTy = E->getArg(0)->getType()->getAs())
+RetElementTy = llvm::VectorType::get(
+Int32Ty, ElementCount::getFixed(Op0VecTy->getNumElements()));
+  auto *RetTy = llvm::StructType::get(RetElementTy, RetElementTy);
+
+  CallInst *CI = Builder.CreateIntrinsic(
+  RetTy, Intrinsic::dx_splitdouble, {Op0}, nullptr, 
"hlsl.splitdouble");
+
+  Value *Arg0 = Builder.CreateExtractValue(CI, 0);
+  Value *Arg1 = Builder.CreateExtractValue(CI, 1);
+
+  Builder.CreateStore(Arg0, Op1TmpLValue.getAddress());
+  LastInst = Builder.CreateStore(Arg1, Op2TmpLValue.getAddress());
+
+} else {
+
+  assert(!CGM.getTarget().getTriple().isDXIL() &&
+ "For non-DXIL targets we generate the instructions");

tex3d wrote:

BTW, I suspect the previous comment was on a slightly different code pattern, 
more like:
```cpp
if (a) {
  //... do some stuff ...
  return;
}

assert(!a);
```
This is slightly different because although the assert is after an if block 
that returns, forcing `!a` to always be true in practice, the code block 
containing the assert isn't syntactically tied to the condition the way it is 
when inside the `else` block.

https://github.com/llvm/llvm-project/pull/109331
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Emit constrained atan2 intrinsic for clang builtin (PR #113636)

2024-10-28 Thread Tex Riddell via cfe-commits

https://github.com/tex3d updated 
https://github.com/llvm/llvm-project/pull/113636

>From 7c7b72b48e07e0f34c2ee65e11e70db37f8c88b3 Mon Sep 17 00:00:00 2001
From: Tex Riddell 
Date: Tue, 15 Oct 2024 16:18:44 -0700
Subject: [PATCH 1/3] Emit constrained atan2 intrinsic for clang builtin

This change is part of this proposal: 
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

- `Builtins.td` - Add f16 support for libm atan2 builtin
- `CGBuiltin.cpp` - Emit constraint atan2 intrinsic for clang builtin

Part of Implement the atan2 HLSL Function #70096.
---
 clang/include/clang/Basic/Builtins.td |  6 +++---
 clang/lib/CodeGen/CGBuiltin.cpp   | 11 ++
 clang/test/CodeGen/X86/math-builtins.c| 14 ++---
 .../test/CodeGen/constrained-math-builtins.c  |  7 +++
 clang/test/CodeGen/libcalls.c |  7 +++
 clang/test/CodeGen/math-libcalls.c| 20 +--
 .../test/CodeGenCXX/builtin-calling-conv.cpp  | 10 +-
 clang/test/CodeGenOpenCL/builtins-f16.cl  |  3 +++
 8 files changed, 49 insertions(+), 29 deletions(-)

diff --git a/clang/include/clang/Basic/Builtins.td 
b/clang/include/clang/Basic/Builtins.td
index 90475a361bb8f8..0d77f4105bb757 100644
--- a/clang/include/clang/Basic/Builtins.td
+++ b/clang/include/clang/Basic/Builtins.td
@@ -227,10 +227,10 @@ def FminimumNumF16F128 : Builtin, F16F128MathTemplate {
   let Prototype = "T(T, T)";
 }
 
-def Atan2F128 : Builtin {
-  let Spellings = ["__builtin_atan2f128"];
+def Atan2F16F128 : Builtin, F16F128MathTemplate {
+  let Spellings = ["__builtin_atan2"];
   let Attributes = [FunctionWithBuiltinPrefix, NoThrow, 
ConstIgnoringErrnoAndExceptions];
-  let Prototype = "__float128(__float128, __float128)";
+  let Prototype = "T(T, T)";
 }
 
 def CopysignF16 : Builtin {
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index a57c95d5b96672..012097e5bd72ee 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -2724,6 +2724,17 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl 
GD, unsigned BuiltinID,
   return RValue::get(emitUnaryMaybeConstrainedFPBuiltin(
   *this, E, Intrinsic::atan, 
Intrinsic::experimental_constrained_atan));
 
+case Builtin::BIatan2:
+case Builtin::BIatan2f:
+case Builtin::BIatan2l:
+case Builtin::BI__builtin_atan2:
+case Builtin::BI__builtin_atan2f:
+case Builtin::BI__builtin_atan2f16:
+case Builtin::BI__builtin_atan2l:
+case Builtin::BI__builtin_atan2f128:
+  return RValue::get(emitBinaryMaybeConstrainedFPBuiltin(
+  *this, E, Intrinsic::atan2, 
Intrinsic::experimental_constrained_atan2));
+
 case Builtin::BIceil:
 case Builtin::BIceilf:
 case Builtin::BIceill:
diff --git a/clang/test/CodeGen/X86/math-builtins.c 
b/clang/test/CodeGen/X86/math-builtins.c
index 48465df21cca19..bf107437fc63a3 100644
--- a/clang/test/CodeGen/X86/math-builtins.c
+++ b/clang/test/CodeGen/X86/math-builtins.c
@@ -45,10 +45,10 @@ void foo(double *d, float f, float *fp, long double *l, int 
*i, const char *c) {
 
   __builtin_atan2(f,f);__builtin_atan2f(f,f) ;  __builtin_atan2l(f, f); 
__builtin_atan2f128(f,f);
 
-// NO__ERRNO: declare double @atan2(double noundef, double noundef) 
[[READNONE:#[0-9]+]]
-// NO__ERRNO: declare float @atan2f(float noundef, float noundef) [[READNONE]]
-// NO__ERRNO: declare x86_fp80 @atan2l(x86_fp80 noundef, x86_fp80 noundef) 
[[READNONE]]
-// NO__ERRNO: declare fp128 @atan2f128(fp128 noundef, fp128 noundef) 
[[READNONE]]
+// NO__ERRNO: declare double @llvm.atan2.f64(double, double) 
[[READNONE_INTRINSIC:#[0-9]+]]
+// NO__ERRNO: declare float @llvm.atan2.f32(float, float) 
[[READNONE_INTRINSIC]]
+// NO__ERRNO: declare x86_fp80 @llvm.atan2.f80(x86_fp80, x86_fp80) 
[[READNONE_INTRINSIC]]
+// NO__ERRNO: declare fp128 @llvm.atan2.f128(fp128, fp128) 
[[READNONE_INTRINSIC]]
 // HAS_ERRNO: declare double @atan2(double noundef, double noundef) 
[[NOT_READNONE]]
 // HAS_ERRNO: declare float @atan2f(float noundef, float noundef) 
[[NOT_READNONE]]
 // HAS_ERRNO: declare x86_fp80 @atan2l(x86_fp80 noundef, x86_fp80 noundef) 
[[NOT_READNONE]]
@@ -56,7 +56,7 @@ void foo(double *d, float f, float *fp, long double *l, int 
*i, const char *c) {
 
   __builtin_copysign(f,f); __builtin_copysignf(f,f); __builtin_copysignl(f,f); 
__builtin_copysignf128(f,f);
 
-// NO__ERRNO: declare double @llvm.copysign.f64(double, double) 
[[READNONE_INTRINSIC:#[0-9]+]]
+// NO__ERRNO: declare double @llvm.copysign.f64(double, double) 
[[READNONE_INTRINSIC]]
 // NO__ERRNO: declare float @llvm.copysign.f32(float, float) 
[[READNONE_INTRINSIC]]
 // NO__ERRNO: declare x86_fp80 @llvm.copysign.f80(x86_fp80, x86_fp80) 
[[READNONE_INTRINSIC]]
 // NO__ERRNO: declare fp128 @llvm.copysign.f128(fp128, fp128) 
[[READNONE_INTRINSIC]]
@@ -179,7 +179,7 @@ void foo(double *d, float f, float *fp, long double *l, int 
*i, const char *c) {
 
   __builtin_acosh(f

[clang] Emit constrained atan2 intrinsic for clang builtin (PR #113636)

2024-10-24 Thread Tex Riddell via cfe-commits

https://github.com/tex3d created 
https://github.com/llvm/llvm-project/pull/113636

This change is part of this proposal: 
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

- `Builtins.td` - Add f16 support for libm atan2 builtin
- `CGBuiltin.cpp` - Emit constraint atan2 intrinsic for clang builtin

Part of Implement the atan2 HLSL Function 
https://github.com/llvm/llvm-project/issues/70096.

>From 0c9dfb67a7371b9c4087d7b54e6f93e780038117 Mon Sep 17 00:00:00 2001
From: Tex Riddell 
Date: Tue, 15 Oct 2024 16:18:44 -0700
Subject: [PATCH 1/2] Emit constrained atan2 intrinsic for clang builtin

This change is part of this proposal: 
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

- `Builtins.td` - Add f16 support for libm atan2 builtin
- `CGBuiltin.cpp` - Emit constraint atan2 intrinsic for clang builtin

Part of Implement the atan2 HLSL Function #70096.
---
 clang/include/clang/Basic/Builtins.td |  6 +++---
 clang/lib/CodeGen/CGBuiltin.cpp   | 11 ++
 clang/test/CodeGen/X86/math-builtins.c| 14 ++---
 .../test/CodeGen/constrained-math-builtins.c  |  7 +++
 clang/test/CodeGen/libcalls.c |  7 +++
 clang/test/CodeGen/math-libcalls.c| 20 +--
 .../test/CodeGenCXX/builtin-calling-conv.cpp  | 10 +-
 clang/test/CodeGenOpenCL/builtins-f16.cl  |  3 +++
 8 files changed, 49 insertions(+), 29 deletions(-)

diff --git a/clang/include/clang/Basic/Builtins.td 
b/clang/include/clang/Basic/Builtins.td
index 90475a361bb8f8..0d77f4105bb757 100644
--- a/clang/include/clang/Basic/Builtins.td
+++ b/clang/include/clang/Basic/Builtins.td
@@ -227,10 +227,10 @@ def FminimumNumF16F128 : Builtin, F16F128MathTemplate {
   let Prototype = "T(T, T)";
 }
 
-def Atan2F128 : Builtin {
-  let Spellings = ["__builtin_atan2f128"];
+def Atan2F16F128 : Builtin, F16F128MathTemplate {
+  let Spellings = ["__builtin_atan2"];
   let Attributes = [FunctionWithBuiltinPrefix, NoThrow, 
ConstIgnoringErrnoAndExceptions];
-  let Prototype = "__float128(__float128, __float128)";
+  let Prototype = "T(T, T)";
 }
 
 def CopysignF16 : Builtin {
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index e2d03eff8ab4a0..0bec8f32552110 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -2724,6 +2724,17 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl 
GD, unsigned BuiltinID,
   return RValue::get(emitUnaryMaybeConstrainedFPBuiltin(
   *this, E, Intrinsic::atan, 
Intrinsic::experimental_constrained_atan));
 
+case Builtin::BIatan2:
+case Builtin::BIatan2f:
+case Builtin::BIatan2l:
+case Builtin::BI__builtin_atan2:
+case Builtin::BI__builtin_atan2f:
+case Builtin::BI__builtin_atan2f16:
+case Builtin::BI__builtin_atan2l:
+case Builtin::BI__builtin_atan2f128:
+  return RValue::get(emitBinaryMaybeConstrainedFPBuiltin(
+  *this, E, Intrinsic::atan2, 
Intrinsic::experimental_constrained_atan2));
+
 case Builtin::BIceil:
 case Builtin::BIceilf:
 case Builtin::BIceill:
diff --git a/clang/test/CodeGen/X86/math-builtins.c 
b/clang/test/CodeGen/X86/math-builtins.c
index 48465df21cca19..bf107437fc63a3 100644
--- a/clang/test/CodeGen/X86/math-builtins.c
+++ b/clang/test/CodeGen/X86/math-builtins.c
@@ -45,10 +45,10 @@ void foo(double *d, float f, float *fp, long double *l, int 
*i, const char *c) {
 
   __builtin_atan2(f,f);__builtin_atan2f(f,f) ;  __builtin_atan2l(f, f); 
__builtin_atan2f128(f,f);
 
-// NO__ERRNO: declare double @atan2(double noundef, double noundef) 
[[READNONE:#[0-9]+]]
-// NO__ERRNO: declare float @atan2f(float noundef, float noundef) [[READNONE]]
-// NO__ERRNO: declare x86_fp80 @atan2l(x86_fp80 noundef, x86_fp80 noundef) 
[[READNONE]]
-// NO__ERRNO: declare fp128 @atan2f128(fp128 noundef, fp128 noundef) 
[[READNONE]]
+// NO__ERRNO: declare double @llvm.atan2.f64(double, double) 
[[READNONE_INTRINSIC:#[0-9]+]]
+// NO__ERRNO: declare float @llvm.atan2.f32(float, float) 
[[READNONE_INTRINSIC]]
+// NO__ERRNO: declare x86_fp80 @llvm.atan2.f80(x86_fp80, x86_fp80) 
[[READNONE_INTRINSIC]]
+// NO__ERRNO: declare fp128 @llvm.atan2.f128(fp128, fp128) 
[[READNONE_INTRINSIC]]
 // HAS_ERRNO: declare double @atan2(double noundef, double noundef) 
[[NOT_READNONE]]
 // HAS_ERRNO: declare float @atan2f(float noundef, float noundef) 
[[NOT_READNONE]]
 // HAS_ERRNO: declare x86_fp80 @atan2l(x86_fp80 noundef, x86_fp80 noundef) 
[[NOT_READNONE]]
@@ -56,7 +56,7 @@ void foo(double *d, float f, float *fp, long double *l, int 
*i, const char *c) {
 
   __builtin_copysign(f,f); __builtin_copysignf(f,f); __builtin_copysignl(f,f); 
__builtin_copysignf128(f,f);
 
-// NO__ERRNO: declare double @llvm.copysign.f64(double, double) 
[[READNONE_INTRINSIC:#[0-9]+]]
+// NO__ERRNO: declare double @llvm.copysign.f64(double, double) 
[[READNONE_INTRINSIC]]
 // NO__ERRNO: declare float @llvm.copysign.f32(float, float) 
[[R

[clang] Emit constrained atan2 intrinsic for clang builtin (PR #113636)

2024-10-24 Thread Tex Riddell via cfe-commits

https://github.com/tex3d updated 
https://github.com/llvm/llvm-project/pull/113636

>From 0c9dfb67a7371b9c4087d7b54e6f93e780038117 Mon Sep 17 00:00:00 2001
From: Tex Riddell 
Date: Tue, 15 Oct 2024 16:18:44 -0700
Subject: [PATCH 1/3] Emit constrained atan2 intrinsic for clang builtin

This change is part of this proposal: 
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

- `Builtins.td` - Add f16 support for libm atan2 builtin
- `CGBuiltin.cpp` - Emit constraint atan2 intrinsic for clang builtin

Part of Implement the atan2 HLSL Function #70096.
---
 clang/include/clang/Basic/Builtins.td |  6 +++---
 clang/lib/CodeGen/CGBuiltin.cpp   | 11 ++
 clang/test/CodeGen/X86/math-builtins.c| 14 ++---
 .../test/CodeGen/constrained-math-builtins.c  |  7 +++
 clang/test/CodeGen/libcalls.c |  7 +++
 clang/test/CodeGen/math-libcalls.c| 20 +--
 .../test/CodeGenCXX/builtin-calling-conv.cpp  | 10 +-
 clang/test/CodeGenOpenCL/builtins-f16.cl  |  3 +++
 8 files changed, 49 insertions(+), 29 deletions(-)

diff --git a/clang/include/clang/Basic/Builtins.td 
b/clang/include/clang/Basic/Builtins.td
index 90475a361bb8f8..0d77f4105bb757 100644
--- a/clang/include/clang/Basic/Builtins.td
+++ b/clang/include/clang/Basic/Builtins.td
@@ -227,10 +227,10 @@ def FminimumNumF16F128 : Builtin, F16F128MathTemplate {
   let Prototype = "T(T, T)";
 }
 
-def Atan2F128 : Builtin {
-  let Spellings = ["__builtin_atan2f128"];
+def Atan2F16F128 : Builtin, F16F128MathTemplate {
+  let Spellings = ["__builtin_atan2"];
   let Attributes = [FunctionWithBuiltinPrefix, NoThrow, 
ConstIgnoringErrnoAndExceptions];
-  let Prototype = "__float128(__float128, __float128)";
+  let Prototype = "T(T, T)";
 }
 
 def CopysignF16 : Builtin {
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index e2d03eff8ab4a0..0bec8f32552110 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -2724,6 +2724,17 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl 
GD, unsigned BuiltinID,
   return RValue::get(emitUnaryMaybeConstrainedFPBuiltin(
   *this, E, Intrinsic::atan, 
Intrinsic::experimental_constrained_atan));
 
+case Builtin::BIatan2:
+case Builtin::BIatan2f:
+case Builtin::BIatan2l:
+case Builtin::BI__builtin_atan2:
+case Builtin::BI__builtin_atan2f:
+case Builtin::BI__builtin_atan2f16:
+case Builtin::BI__builtin_atan2l:
+case Builtin::BI__builtin_atan2f128:
+  return RValue::get(emitBinaryMaybeConstrainedFPBuiltin(
+  *this, E, Intrinsic::atan2, 
Intrinsic::experimental_constrained_atan2));
+
 case Builtin::BIceil:
 case Builtin::BIceilf:
 case Builtin::BIceill:
diff --git a/clang/test/CodeGen/X86/math-builtins.c 
b/clang/test/CodeGen/X86/math-builtins.c
index 48465df21cca19..bf107437fc63a3 100644
--- a/clang/test/CodeGen/X86/math-builtins.c
+++ b/clang/test/CodeGen/X86/math-builtins.c
@@ -45,10 +45,10 @@ void foo(double *d, float f, float *fp, long double *l, int 
*i, const char *c) {
 
   __builtin_atan2(f,f);__builtin_atan2f(f,f) ;  __builtin_atan2l(f, f); 
__builtin_atan2f128(f,f);
 
-// NO__ERRNO: declare double @atan2(double noundef, double noundef) 
[[READNONE:#[0-9]+]]
-// NO__ERRNO: declare float @atan2f(float noundef, float noundef) [[READNONE]]
-// NO__ERRNO: declare x86_fp80 @atan2l(x86_fp80 noundef, x86_fp80 noundef) 
[[READNONE]]
-// NO__ERRNO: declare fp128 @atan2f128(fp128 noundef, fp128 noundef) 
[[READNONE]]
+// NO__ERRNO: declare double @llvm.atan2.f64(double, double) 
[[READNONE_INTRINSIC:#[0-9]+]]
+// NO__ERRNO: declare float @llvm.atan2.f32(float, float) 
[[READNONE_INTRINSIC]]
+// NO__ERRNO: declare x86_fp80 @llvm.atan2.f80(x86_fp80, x86_fp80) 
[[READNONE_INTRINSIC]]
+// NO__ERRNO: declare fp128 @llvm.atan2.f128(fp128, fp128) 
[[READNONE_INTRINSIC]]
 // HAS_ERRNO: declare double @atan2(double noundef, double noundef) 
[[NOT_READNONE]]
 // HAS_ERRNO: declare float @atan2f(float noundef, float noundef) 
[[NOT_READNONE]]
 // HAS_ERRNO: declare x86_fp80 @atan2l(x86_fp80 noundef, x86_fp80 noundef) 
[[NOT_READNONE]]
@@ -56,7 +56,7 @@ void foo(double *d, float f, float *fp, long double *l, int 
*i, const char *c) {
 
   __builtin_copysign(f,f); __builtin_copysignf(f,f); __builtin_copysignl(f,f); 
__builtin_copysignf128(f,f);
 
-// NO__ERRNO: declare double @llvm.copysign.f64(double, double) 
[[READNONE_INTRINSIC:#[0-9]+]]
+// NO__ERRNO: declare double @llvm.copysign.f64(double, double) 
[[READNONE_INTRINSIC]]
 // NO__ERRNO: declare float @llvm.copysign.f32(float, float) 
[[READNONE_INTRINSIC]]
 // NO__ERRNO: declare x86_fp80 @llvm.copysign.f80(x86_fp80, x86_fp80) 
[[READNONE_INTRINSIC]]
 // NO__ERRNO: declare fp128 @llvm.copysign.f128(fp128, fp128) 
[[READNONE_INTRINSIC]]
@@ -179,7 +179,7 @@ void foo(double *d, float f, float *fp, long double *l, int 
*i, const char *c) {
 
   __builtin_acosh(f

[clang] [llvm] Adding splitdouble HLSL function (PR #109331)

2024-10-24 Thread Tex Riddell via cfe-commits

https://github.com/tex3d approved this pull request.


https://github.com/llvm/llvm-project/pull/109331
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Adding splitdouble HLSL function (PR #109331)

2024-10-25 Thread Tex Riddell via cfe-commits


@@ -95,6 +99,133 @@ static void initializeAlloca(CodeGenFunction &CGF, 
AllocaInst *AI, Value *Size,
   I->addAnnotationMetadata("auto-init");
 }
 
+static Value *handleHlslSplitdouble(const CallExpr *E, CodeGenFunction *CGF) {
+  Value *Op0 = CGF->EmitScalarExpr(E->getArg(0));
+  const auto *OutArg1 = dyn_cast(E->getArg(1));
+  const auto *OutArg2 = dyn_cast(E->getArg(2));
+
+  CallArgList Args;
+  LValue Op1TmpLValue =
+  CGF->EmitHLSLOutArgExpr(OutArg1, Args, OutArg1->getType());
+  LValue Op2TmpLValue =
+  CGF->EmitHLSLOutArgExpr(OutArg2, Args, OutArg2->getType());
+
+  if (CGF->getTarget().getCXXABI().areArgsDestroyedLeftToRightInCallee())
+Args.reverseWritebacks();
+
+  auto EmitVectorCode =
+  [](Value *Op, CGBuilderTy *Builder,
+ FixedVectorType *DestTy) -> std::pair {
+Value *bitcast = Builder->CreateBitCast(Op, DestTy);
+
+SmallVector LowbitsIndex;
+SmallVector HighbitsIndex;
+
+for (unsigned int Idx = 0; Idx < DestTy->getNumElements(); Idx += 2) {
+  LowbitsIndex.push_back(Idx);
+  HighbitsIndex.push_back(Idx + 1);
+}
+
+Value *Arg0 = Builder->CreateShuffleVector(bitcast, LowbitsIndex);
+Value *Arg1 = Builder->CreateShuffleVector(bitcast, HighbitsIndex);
+
+return std::make_pair(Arg0, Arg1);
+  };

tex3d wrote:

Though you might want to skip this simplification if a generalized version of 
this is used to handle any size vectors with constraints being applied later 
for SPIR-V.

https://github.com/llvm/llvm-project/pull/109331
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [HLSL][NFC] Cleanup - removed unused function, includes and param, fix typos (PR #113649)

2024-11-04 Thread Tex Riddell via cfe-commits

https://github.com/tex3d approved this pull request.


https://github.com/llvm/llvm-project/pull/113649
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Emit constrained atan2 intrinsic for clang builtin (PR #113636)

2024-11-04 Thread Tex Riddell via cfe-commits

https://github.com/tex3d edited https://github.com/llvm/llvm-project/pull/113636
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Emit constrained atan2 intrinsic for clang builtin (PR #113636)

2024-11-04 Thread Tex Riddell via cfe-commits

https://github.com/tex3d edited https://github.com/llvm/llvm-project/pull/113636
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [HLSL] Change StructuredBuffer resource class to SRV (PR #113397)

2024-10-23 Thread Tex Riddell via cfe-commits

https://github.com/tex3d approved this pull request.


https://github.com/llvm/llvm-project/pull/113397
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [HLSL] Add RWStructuredBuffer definition to HLSLExternalSemaSource (PR #113477)

2024-10-23 Thread Tex Riddell via cfe-commits


@@ -17,7 +17,7 @@
 // EMPTY-NEXT: FinalAttr 0x{{[0-9A-Fa-f]+}} <> Implicit final
 
 // There should be no more occurrances of StructuredBuffer
-// EMPTY-NOT: StructuredBuffer
+// EMPTY-NOT: {{/s}}StructuredBuffer

tex3d wrote:

How do you know that "/s" is space?  I didn't know that, and I can't find that 
in POSIX extended regex docs.  I would think you need to use character class 
like `[:space:]` instead.

https://github.com/llvm/llvm-project/pull/113477
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [HLSL] Add RWStructuredBuffer definition to HLSLExternalSemaSource (PR #113477)

2024-10-23 Thread Tex Riddell via cfe-commits


@@ -0,0 +1,70 @@
+// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.2-compute 
-finclude-default-header -fnative-half-type -emit-llvm -o - %s | FileCheck %s
+
+// NOTE: The number in type name and whether the struct is packed or not will 
mostly
+// likely change once subscript operators are properly implemented 
(llvm/llvm-project#95956) 
+// and theinterim field of the contained type is removed.
+
+// CHECK: %"class.hlsl::RWStructuredBuffer" = type <{ target("dx.RawBuffer", 
i16, 1, 0)
+// CHECK: %"class.hlsl::RWStructuredBuffer.0" = type <{ target("dx.RawBuffer", 
i16, 1, 0)
+// CHECK: %"class.hlsl::RWStructuredBuffer.2" = type { target("dx.RawBuffer", 
i32, 1, 0)
+// CHECK: %"class.hlsl::RWStructuredBuffer.3" = type { target("dx.RawBuffer", 
i32, 1, 0)
+// CHECK: %"class.hlsl::RWStructuredBuffer.4" = type { target("dx.RawBuffer", 
i64, 1, 0)
+// CHECK: %"class.hlsl::RWStructuredBuffer.5" = type { target("dx.RawBuffer", 
i64, 1, 0)
+// CHECK: %"class.hlsl::RWStructuredBuffer.6" = type <{ target("dx.RawBuffer", 
half, 1, 0) 
+// CHECK: %"class.hlsl::RWStructuredBuffer.8" = type { target("dx.RawBuffer", 
float, 1, 0)
+// CHECK: %"class.hlsl::RWStructuredBuffer.9" = type { target("dx.RawBuffer", 
double, 1, 0)
+// CHECK: %"class.hlsl::RWStructuredBuffer.10" = type { target("dx.RawBuffer", 
<4 x i16>, 1, 0)
+// CHECK: %"class.hlsl::RWStructuredBuffer.11" = type { target("dx.RawBuffer", 
<3 x i32>, 1, 0)
+// CHECK: %"class.hlsl::RWStructuredBuffer.12" = type { target("dx.RawBuffer", 
<2 x half>, 1, 0)
+// CHECK: %"class.hlsl::RWStructuredBuffer.13" = type { target("dx.RawBuffer", 
<3 x float>, 1, 0)
+
+RWStructuredBuffer BufI16;
+RWStructuredBuffer BufU16;
+RWStructuredBuffer BufI32;
+RWStructuredBuffer BufU32;
+RWStructuredBuffer BufI64;
+RWStructuredBuffer BufU64;
+RWStructuredBuffer BufF16;
+RWStructuredBuffer BufF32;
+RWStructuredBuffer BufF64;
+RWStructuredBuffer< vector > BufI16x4;
+RWStructuredBuffer< vector > BufU32x3;
+RWStructuredBuffer BufF16x2;
+RWStructuredBuffer BufF32x3;
+// TODO: RWStructuredBuffer BufSNormF16; -> 11

tex3d wrote:

I also found that a little confusing.  The numbers are simply an artifact of 
disambiguating identically-named types, and is already commented above that 
check block.  I don't know if listing specific numbers here is helpful, or just 
confusing.

https://github.com/llvm/llvm-project/pull/113477
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [HLSL] Add RWStructuredBuffer definition to HLSLExternalSemaSource (PR #113477)

2024-10-23 Thread Tex Riddell via cfe-commits


@@ -17,7 +17,7 @@
 // EMPTY-NEXT: FinalAttr 0x{{[0-9A-Fa-f]+}} <> Implicit final
 
 // There should be no more occurrances of StructuredBuffer
-// EMPTY-NOT: StructuredBuffer
+// EMPTY-NOT: {{/s}}StructuredBuffer

tex3d wrote:

Oh yeah, I think you could also use `{{[^W]}}` to exclude matches of 
`WStructuredBuffer` instead of specifically looking for space.  What if you 
have `'StructuredBuffer ...'`, with `'` or something other than space right 
before it?

https://github.com/llvm/llvm-project/pull/113477
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Emit constrained atan2 intrinsic for clang builtin (PR #113636)

2024-11-11 Thread Tex Riddell via cfe-commits

tex3d wrote:

@farzonl 
> Here are just a handful of tests that we might want to update, but there are 
> others:

I'm updating these tests that were called out.  What others do I need to look 
for?  How do I know if I've found all the tests I need to update?

https://github.com/llvm/llvm-project/pull/113636
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Emit constrained atan2 intrinsic for clang builtin (PR #113636)

2024-11-11 Thread Tex Riddell via cfe-commits

https://github.com/tex3d updated 
https://github.com/llvm/llvm-project/pull/113636

>From a6776121bb118fe4083ccb94fa582cca1aef7f9b Mon Sep 17 00:00:00 2001
From: Tex Riddell 
Date: Tue, 15 Oct 2024 16:18:44 -0700
Subject: [PATCH 1/4] Emit constrained atan2 intrinsic for clang builtin

This change is part of this proposal: 
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

- `Builtins.td` - Add f16 support for libm atan2 builtin
- `CGBuiltin.cpp` - Emit constraint atan2 intrinsic for clang builtin

Part of Implement the atan2 HLSL Function #70096.
---
 clang/include/clang/Basic/Builtins.td |  6 +++---
 clang/lib/CodeGen/CGBuiltin.cpp   | 11 ++
 clang/test/CodeGen/X86/math-builtins.c| 14 ++---
 .../test/CodeGen/constrained-math-builtins.c  |  7 +++
 clang/test/CodeGen/libcalls.c |  7 +++
 clang/test/CodeGen/math-libcalls.c| 20 +--
 .../test/CodeGenCXX/builtin-calling-conv.cpp  | 10 +-
 clang/test/CodeGenOpenCL/builtins-f16.cl  |  3 +++
 8 files changed, 49 insertions(+), 29 deletions(-)

diff --git a/clang/include/clang/Basic/Builtins.td 
b/clang/include/clang/Basic/Builtins.td
index 87a798183d6e19..305b085f69420a 100644
--- a/clang/include/clang/Basic/Builtins.td
+++ b/clang/include/clang/Basic/Builtins.td
@@ -227,10 +227,10 @@ def FminimumNumF16F128 : Builtin, F16F128MathTemplate {
   let Prototype = "T(T, T)";
 }
 
-def Atan2F128 : Builtin {
-  let Spellings = ["__builtin_atan2f128"];
+def Atan2F16F128 : Builtin, F16F128MathTemplate {
+  let Spellings = ["__builtin_atan2"];
   let Attributes = [FunctionWithBuiltinPrefix, NoThrow, 
ConstIgnoringErrnoAndExceptions];
-  let Prototype = "__float128(__float128, __float128)";
+  let Prototype = "T(T, T)";
 }
 
 def CopysignF16 : Builtin {
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 5c3df5124517d6..9b63fcbedc8c45 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -2798,6 +2798,17 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl 
GD, unsigned BuiltinID,
   return RValue::get(emitUnaryMaybeConstrainedFPBuiltin(
   *this, E, Intrinsic::atan, 
Intrinsic::experimental_constrained_atan));
 
+case Builtin::BIatan2:
+case Builtin::BIatan2f:
+case Builtin::BIatan2l:
+case Builtin::BI__builtin_atan2:
+case Builtin::BI__builtin_atan2f:
+case Builtin::BI__builtin_atan2f16:
+case Builtin::BI__builtin_atan2l:
+case Builtin::BI__builtin_atan2f128:
+  return RValue::get(emitBinaryMaybeConstrainedFPBuiltin(
+  *this, E, Intrinsic::atan2, 
Intrinsic::experimental_constrained_atan2));
+
 case Builtin::BIceil:
 case Builtin::BIceilf:
 case Builtin::BIceill:
diff --git a/clang/test/CodeGen/X86/math-builtins.c 
b/clang/test/CodeGen/X86/math-builtins.c
index 48465df21cca19..bf107437fc63a3 100644
--- a/clang/test/CodeGen/X86/math-builtins.c
+++ b/clang/test/CodeGen/X86/math-builtins.c
@@ -45,10 +45,10 @@ void foo(double *d, float f, float *fp, long double *l, int 
*i, const char *c) {
 
   __builtin_atan2(f,f);__builtin_atan2f(f,f) ;  __builtin_atan2l(f, f); 
__builtin_atan2f128(f,f);
 
-// NO__ERRNO: declare double @atan2(double noundef, double noundef) 
[[READNONE:#[0-9]+]]
-// NO__ERRNO: declare float @atan2f(float noundef, float noundef) [[READNONE]]
-// NO__ERRNO: declare x86_fp80 @atan2l(x86_fp80 noundef, x86_fp80 noundef) 
[[READNONE]]
-// NO__ERRNO: declare fp128 @atan2f128(fp128 noundef, fp128 noundef) 
[[READNONE]]
+// NO__ERRNO: declare double @llvm.atan2.f64(double, double) 
[[READNONE_INTRINSIC:#[0-9]+]]
+// NO__ERRNO: declare float @llvm.atan2.f32(float, float) 
[[READNONE_INTRINSIC]]
+// NO__ERRNO: declare x86_fp80 @llvm.atan2.f80(x86_fp80, x86_fp80) 
[[READNONE_INTRINSIC]]
+// NO__ERRNO: declare fp128 @llvm.atan2.f128(fp128, fp128) 
[[READNONE_INTRINSIC]]
 // HAS_ERRNO: declare double @atan2(double noundef, double noundef) 
[[NOT_READNONE]]
 // HAS_ERRNO: declare float @atan2f(float noundef, float noundef) 
[[NOT_READNONE]]
 // HAS_ERRNO: declare x86_fp80 @atan2l(x86_fp80 noundef, x86_fp80 noundef) 
[[NOT_READNONE]]
@@ -56,7 +56,7 @@ void foo(double *d, float f, float *fp, long double *l, int 
*i, const char *c) {
 
   __builtin_copysign(f,f); __builtin_copysignf(f,f); __builtin_copysignl(f,f); 
__builtin_copysignf128(f,f);
 
-// NO__ERRNO: declare double @llvm.copysign.f64(double, double) 
[[READNONE_INTRINSIC:#[0-9]+]]
+// NO__ERRNO: declare double @llvm.copysign.f64(double, double) 
[[READNONE_INTRINSIC]]
 // NO__ERRNO: declare float @llvm.copysign.f32(float, float) 
[[READNONE_INTRINSIC]]
 // NO__ERRNO: declare x86_fp80 @llvm.copysign.f80(x86_fp80, x86_fp80) 
[[READNONE_INTRINSIC]]
 // NO__ERRNO: declare fp128 @llvm.copysign.f128(fp128, fp128) 
[[READNONE_INTRINSIC]]
@@ -179,7 +179,7 @@ void foo(double *d, float f, float *fp, long double *l, int 
*i, const char *c) {
 
   __builtin_acosh(f

[clang] Emit constrained atan2 intrinsic for clang builtin (PR #113636)

2024-11-11 Thread Tex Riddell via cfe-commits


@@ -13,7 +13,7 @@ using size_t = unsigned long;
 #endif // SPIR
 } // namespace std
 
-float __builtin_atan2f(float, float);
+float __builtin_erff(float);

tex3d wrote:

Right, I hope I captured that in the PR description above.  If that's not clear 
enough let me know.

https://github.com/llvm/llvm-project/pull/113636
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Emit constrained atan2 intrinsic for clang builtin (PR #113636)

2024-11-11 Thread Tex Riddell via cfe-commits

https://github.com/tex3d updated 
https://github.com/llvm/llvm-project/pull/113636

>From a6776121bb118fe4083ccb94fa582cca1aef7f9b Mon Sep 17 00:00:00 2001
From: Tex Riddell 
Date: Tue, 15 Oct 2024 16:18:44 -0700
Subject: [PATCH 1/6] Emit constrained atan2 intrinsic for clang builtin

This change is part of this proposal: 
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

- `Builtins.td` - Add f16 support for libm atan2 builtin
- `CGBuiltin.cpp` - Emit constraint atan2 intrinsic for clang builtin

Part of Implement the atan2 HLSL Function #70096.
---
 clang/include/clang/Basic/Builtins.td |  6 +++---
 clang/lib/CodeGen/CGBuiltin.cpp   | 11 ++
 clang/test/CodeGen/X86/math-builtins.c| 14 ++---
 .../test/CodeGen/constrained-math-builtins.c  |  7 +++
 clang/test/CodeGen/libcalls.c |  7 +++
 clang/test/CodeGen/math-libcalls.c| 20 +--
 .../test/CodeGenCXX/builtin-calling-conv.cpp  | 10 +-
 clang/test/CodeGenOpenCL/builtins-f16.cl  |  3 +++
 8 files changed, 49 insertions(+), 29 deletions(-)

diff --git a/clang/include/clang/Basic/Builtins.td 
b/clang/include/clang/Basic/Builtins.td
index 87a798183d6e19..305b085f69420a 100644
--- a/clang/include/clang/Basic/Builtins.td
+++ b/clang/include/clang/Basic/Builtins.td
@@ -227,10 +227,10 @@ def FminimumNumF16F128 : Builtin, F16F128MathTemplate {
   let Prototype = "T(T, T)";
 }
 
-def Atan2F128 : Builtin {
-  let Spellings = ["__builtin_atan2f128"];
+def Atan2F16F128 : Builtin, F16F128MathTemplate {
+  let Spellings = ["__builtin_atan2"];
   let Attributes = [FunctionWithBuiltinPrefix, NoThrow, 
ConstIgnoringErrnoAndExceptions];
-  let Prototype = "__float128(__float128, __float128)";
+  let Prototype = "T(T, T)";
 }
 
 def CopysignF16 : Builtin {
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 5c3df5124517d6..9b63fcbedc8c45 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -2798,6 +2798,17 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl 
GD, unsigned BuiltinID,
   return RValue::get(emitUnaryMaybeConstrainedFPBuiltin(
   *this, E, Intrinsic::atan, 
Intrinsic::experimental_constrained_atan));
 
+case Builtin::BIatan2:
+case Builtin::BIatan2f:
+case Builtin::BIatan2l:
+case Builtin::BI__builtin_atan2:
+case Builtin::BI__builtin_atan2f:
+case Builtin::BI__builtin_atan2f16:
+case Builtin::BI__builtin_atan2l:
+case Builtin::BI__builtin_atan2f128:
+  return RValue::get(emitBinaryMaybeConstrainedFPBuiltin(
+  *this, E, Intrinsic::atan2, 
Intrinsic::experimental_constrained_atan2));
+
 case Builtin::BIceil:
 case Builtin::BIceilf:
 case Builtin::BIceill:
diff --git a/clang/test/CodeGen/X86/math-builtins.c 
b/clang/test/CodeGen/X86/math-builtins.c
index 48465df21cca19..bf107437fc63a3 100644
--- a/clang/test/CodeGen/X86/math-builtins.c
+++ b/clang/test/CodeGen/X86/math-builtins.c
@@ -45,10 +45,10 @@ void foo(double *d, float f, float *fp, long double *l, int 
*i, const char *c) {
 
   __builtin_atan2(f,f);__builtin_atan2f(f,f) ;  __builtin_atan2l(f, f); 
__builtin_atan2f128(f,f);
 
-// NO__ERRNO: declare double @atan2(double noundef, double noundef) 
[[READNONE:#[0-9]+]]
-// NO__ERRNO: declare float @atan2f(float noundef, float noundef) [[READNONE]]
-// NO__ERRNO: declare x86_fp80 @atan2l(x86_fp80 noundef, x86_fp80 noundef) 
[[READNONE]]
-// NO__ERRNO: declare fp128 @atan2f128(fp128 noundef, fp128 noundef) 
[[READNONE]]
+// NO__ERRNO: declare double @llvm.atan2.f64(double, double) 
[[READNONE_INTRINSIC:#[0-9]+]]
+// NO__ERRNO: declare float @llvm.atan2.f32(float, float) 
[[READNONE_INTRINSIC]]
+// NO__ERRNO: declare x86_fp80 @llvm.atan2.f80(x86_fp80, x86_fp80) 
[[READNONE_INTRINSIC]]
+// NO__ERRNO: declare fp128 @llvm.atan2.f128(fp128, fp128) 
[[READNONE_INTRINSIC]]
 // HAS_ERRNO: declare double @atan2(double noundef, double noundef) 
[[NOT_READNONE]]
 // HAS_ERRNO: declare float @atan2f(float noundef, float noundef) 
[[NOT_READNONE]]
 // HAS_ERRNO: declare x86_fp80 @atan2l(x86_fp80 noundef, x86_fp80 noundef) 
[[NOT_READNONE]]
@@ -56,7 +56,7 @@ void foo(double *d, float f, float *fp, long double *l, int 
*i, const char *c) {
 
   __builtin_copysign(f,f); __builtin_copysignf(f,f); __builtin_copysignl(f,f); 
__builtin_copysignf128(f,f);
 
-// NO__ERRNO: declare double @llvm.copysign.f64(double, double) 
[[READNONE_INTRINSIC:#[0-9]+]]
+// NO__ERRNO: declare double @llvm.copysign.f64(double, double) 
[[READNONE_INTRINSIC]]
 // NO__ERRNO: declare float @llvm.copysign.f32(float, float) 
[[READNONE_INTRINSIC]]
 // NO__ERRNO: declare x86_fp80 @llvm.copysign.f80(x86_fp80, x86_fp80) 
[[READNONE_INTRINSIC]]
 // NO__ERRNO: declare fp128 @llvm.copysign.f128(fp128, fp128) 
[[READNONE_INTRINSIC]]
@@ -179,7 +179,7 @@ void foo(double *d, float f, float *fp, long double *l, int 
*i, const char *c) {
 
   __builtin_acosh(f

[clang] [llvm] Emit constrained atan2 intrinsic for clang builtin (PR #113636)

2024-11-11 Thread Tex Riddell via cfe-commits

https://github.com/tex3d updated 
https://github.com/llvm/llvm-project/pull/113636

>From 661bd4ceba1e60bc12e1e85bffc53edfd13f5494 Mon Sep 17 00:00:00 2001
From: Tex Riddell 
Date: Tue, 15 Oct 2024 16:18:44 -0700
Subject: [PATCH 1/6] Emit constrained atan2 intrinsic for clang builtin

This change is part of this proposal: 
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

- `Builtins.td` - Add f16 support for libm atan2 builtin
- `CGBuiltin.cpp` - Emit constraint atan2 intrinsic for clang builtin

Part of Implement the atan2 HLSL Function #70096.
---
 clang/include/clang/Basic/Builtins.td |  6 +++---
 clang/lib/CodeGen/CGBuiltin.cpp   | 11 ++
 clang/test/CodeGen/X86/math-builtins.c| 14 ++---
 .../test/CodeGen/constrained-math-builtins.c  |  7 +++
 clang/test/CodeGen/libcalls.c |  7 +++
 clang/test/CodeGen/math-libcalls.c| 20 +--
 .../test/CodeGenCXX/builtin-calling-conv.cpp  | 10 +-
 clang/test/CodeGenOpenCL/builtins-f16.cl  |  3 +++
 8 files changed, 49 insertions(+), 29 deletions(-)

diff --git a/clang/include/clang/Basic/Builtins.td 
b/clang/include/clang/Basic/Builtins.td
index 4360e0bf9840f1..e866605ac05c09 100644
--- a/clang/include/clang/Basic/Builtins.td
+++ b/clang/include/clang/Basic/Builtins.td
@@ -227,10 +227,10 @@ def FminimumNumF16F128 : Builtin, F16F128MathTemplate {
   let Prototype = "T(T, T)";
 }
 
-def Atan2F128 : Builtin {
-  let Spellings = ["__builtin_atan2f128"];
+def Atan2F16F128 : Builtin, F16F128MathTemplate {
+  let Spellings = ["__builtin_atan2"];
   let Attributes = [FunctionWithBuiltinPrefix, NoThrow, 
ConstIgnoringErrnoAndExceptions];
-  let Prototype = "__float128(__float128, __float128)";
+  let Prototype = "T(T, T)";
 }
 
 def CopysignF16 : Builtin {
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 430ac5626f89d7..eaae4fbf711c8d 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -2798,6 +2798,17 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl 
GD, unsigned BuiltinID,
   return RValue::get(emitUnaryMaybeConstrainedFPBuiltin(
   *this, E, Intrinsic::atan, 
Intrinsic::experimental_constrained_atan));
 
+case Builtin::BIatan2:
+case Builtin::BIatan2f:
+case Builtin::BIatan2l:
+case Builtin::BI__builtin_atan2:
+case Builtin::BI__builtin_atan2f:
+case Builtin::BI__builtin_atan2f16:
+case Builtin::BI__builtin_atan2l:
+case Builtin::BI__builtin_atan2f128:
+  return RValue::get(emitBinaryMaybeConstrainedFPBuiltin(
+  *this, E, Intrinsic::atan2, 
Intrinsic::experimental_constrained_atan2));
+
 case Builtin::BIceil:
 case Builtin::BIceilf:
 case Builtin::BIceill:
diff --git a/clang/test/CodeGen/X86/math-builtins.c 
b/clang/test/CodeGen/X86/math-builtins.c
index 48465df21cca19..bf107437fc63a3 100644
--- a/clang/test/CodeGen/X86/math-builtins.c
+++ b/clang/test/CodeGen/X86/math-builtins.c
@@ -45,10 +45,10 @@ void foo(double *d, float f, float *fp, long double *l, int 
*i, const char *c) {
 
   __builtin_atan2(f,f);__builtin_atan2f(f,f) ;  __builtin_atan2l(f, f); 
__builtin_atan2f128(f,f);
 
-// NO__ERRNO: declare double @atan2(double noundef, double noundef) 
[[READNONE:#[0-9]+]]
-// NO__ERRNO: declare float @atan2f(float noundef, float noundef) [[READNONE]]
-// NO__ERRNO: declare x86_fp80 @atan2l(x86_fp80 noundef, x86_fp80 noundef) 
[[READNONE]]
-// NO__ERRNO: declare fp128 @atan2f128(fp128 noundef, fp128 noundef) 
[[READNONE]]
+// NO__ERRNO: declare double @llvm.atan2.f64(double, double) 
[[READNONE_INTRINSIC:#[0-9]+]]
+// NO__ERRNO: declare float @llvm.atan2.f32(float, float) 
[[READNONE_INTRINSIC]]
+// NO__ERRNO: declare x86_fp80 @llvm.atan2.f80(x86_fp80, x86_fp80) 
[[READNONE_INTRINSIC]]
+// NO__ERRNO: declare fp128 @llvm.atan2.f128(fp128, fp128) 
[[READNONE_INTRINSIC]]
 // HAS_ERRNO: declare double @atan2(double noundef, double noundef) 
[[NOT_READNONE]]
 // HAS_ERRNO: declare float @atan2f(float noundef, float noundef) 
[[NOT_READNONE]]
 // HAS_ERRNO: declare x86_fp80 @atan2l(x86_fp80 noundef, x86_fp80 noundef) 
[[NOT_READNONE]]
@@ -56,7 +56,7 @@ void foo(double *d, float f, float *fp, long double *l, int 
*i, const char *c) {
 
   __builtin_copysign(f,f); __builtin_copysignf(f,f); __builtin_copysignl(f,f); 
__builtin_copysignf128(f,f);
 
-// NO__ERRNO: declare double @llvm.copysign.f64(double, double) 
[[READNONE_INTRINSIC:#[0-9]+]]
+// NO__ERRNO: declare double @llvm.copysign.f64(double, double) 
[[READNONE_INTRINSIC]]
 // NO__ERRNO: declare float @llvm.copysign.f32(float, float) 
[[READNONE_INTRINSIC]]
 // NO__ERRNO: declare x86_fp80 @llvm.copysign.f80(x86_fp80, x86_fp80) 
[[READNONE_INTRINSIC]]
 // NO__ERRNO: declare fp128 @llvm.copysign.f128(fp128, fp128) 
[[READNONE_INTRINSIC]]
@@ -179,7 +179,7 @@ void foo(double *d, float f, float *fp, long double *l, int 
*i, const char *c) {
 
   __builtin_acosh(f

[clang] [HLSL] Add RWStructuredBuffer definition to HLSLExternalSemaSource (PR #113477)

2024-10-23 Thread Tex Riddell via cfe-commits


@@ -17,7 +17,7 @@
 // EMPTY-NEXT: FinalAttr 0x{{[0-9A-Fa-f]+}} <> Implicit final
 
 // There should be no more occurrances of StructuredBuffer
-// EMPTY-NOT: StructuredBuffer
+// EMPTY-NOT: {{^\W}}StructuredBuffer

tex3d wrote:

This still doesn't look right.  Did you test this expression for a positive 
match before using here?  I tried that expression and it does not match what 
you're looking for.  I confirmed that this expression works though: 
`{{[^[:alnum:]]}}StructuredBuffer`.

https://github.com/llvm/llvm-project/pull/113477
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [HLSL] Add RWStructuredBuffer definition to HLSLExternalSemaSource (PR #113477)

2024-10-23 Thread Tex Riddell via cfe-commits


@@ -17,7 +17,7 @@
 // EMPTY-NEXT: FinalAttr 0x{{[0-9A-Fa-f]+}} <> Implicit final
 
 // There should be no more occurrances of StructuredBuffer
-// EMPTY-NOT: StructuredBuffer
+// EMPTY-NOT: {{^\W}}StructuredBuffer

tex3d wrote:

By the way, my suggestion of `{{[^W]}}` was to match any character except `W`.  
`{{^\W}}` looks for `W` at the beginning of the line: `^` outside of character 
set matches the beginning of the line, `\W` is just going to match `W`, I 
think.  It doesn't suppose character classes starting with `\` such as `\W`.  
Instead, you have to use the ones like in my last comment.

Here's a source that I think matches what FileCheck supports pretty well:
https://en.wikibooks.org/wiki/Regular_Expressions/POSIX-Extended_Regular_Expressions

https://github.com/llvm/llvm-project/pull/113477
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Adding splitdouble HLSL function (PR #109331)

2024-10-23 Thread Tex Riddell via cfe-commits


@@ -95,6 +99,144 @@ static void initializeAlloca(CodeGenFunction &CGF, 
AllocaInst *AI, Value *Size,
   I->addAnnotationMetadata("auto-init");
 }
 
+static Value *handleHlslSplitdouble(const CallExpr *E, CodeGenFunction *CGF) {
+  Value *Op0 = CGF->EmitScalarExpr(E->getArg(0));
+  const auto *OutArg1 = dyn_cast(E->getArg(1));
+  const auto *OutArg2 = dyn_cast(E->getArg(2));
+
+  CallArgList Args;
+  LValue Op1TmpLValue =
+  CGF->EmitHLSLOutArgExpr(OutArg1, Args, OutArg1->getType());
+  LValue Op2TmpLValue =
+  CGF->EmitHLSLOutArgExpr(OutArg2, Args, OutArg2->getType());
+
+  if (CGF->getTarget().getCXXABI().areArgsDestroyedLeftToRightInCallee())
+Args.reverseWritebacks();
+
+  auto EmitVectorCode =
+  [](Value *Op, CGBuilderTy *Builder,
+ FixedVectorType *DestTy) -> std::pair {
+Value *bitcast = Builder->CreateBitCast(Op, DestTy);
+
+SmallVector LowbitsIndex;
+SmallVector HighbitsIndex;
+
+for (unsigned int Idx = 0; Idx < DestTy->getNumElements(); Idx += 2) {
+  LowbitsIndex.push_back(Idx);
+  HighbitsIndex.push_back(Idx + 1);
+}
+
+Value *Arg0 = Builder->CreateShuffleVector(bitcast, LowbitsIndex);
+Value *Arg1 = Builder->CreateShuffleVector(bitcast, HighbitsIndex);
+
+return std::make_pair(Arg0, Arg1);
+  };
+
+  Value *LowBits = nullptr;
+  Value *HighBits = nullptr;
+
+  if (CGF->CGM.getTarget().getTriple().isDXIL()) {
+
+llvm::Type *RetElementTy = CGF->Int32Ty;
+if (auto *Op0VecTy = E->getArg(0)->getType()->getAs())
+  RetElementTy = llvm::VectorType::get(
+  CGF->Int32Ty, ElementCount::getFixed(Op0VecTy->getNumElements()));
+auto *RetTy = llvm::StructType::get(RetElementTy, RetElementTy);
+
+CallInst *CI = CGF->Builder.CreateIntrinsic(
+RetTy, Intrinsic::dx_splitdouble, {Op0}, nullptr, "hlsl.splitdouble");
+
+LowBits = CGF->Builder.CreateExtractValue(CI, 0);
+HighBits = CGF->Builder.CreateExtractValue(CI, 1);
+
+  } else {
+// For Non DXIL targets we generate the instructions.
+// TODO: This code accounts for known limitations in
+// SPIR-V and splitdouble. Such should be handled,
+// in a later compilation stage. After [issue link here]
+// is fixed, this shall be refactored.
+
+if (!Op0->getType()->isVectorTy()) {
+  FixedVectorType *DestTy = FixedVectorType::get(CGF->Int32Ty, 2);
+  Value *Bitcast = CGF->Builder.CreateBitCast(Op0, DestTy);
+
+  LowBits = CGF->Builder.CreateExtractElement(Bitcast, 0.0);
+  HighBits = CGF->Builder.CreateExtractElement(Bitcast, 1.0);
+} else {
+
+  const auto *TargTy = E->getArg(0)->getType()->getAs();
+
+  int NumElements = TargTy->getNumElements();
+
+  FixedVectorType *DestTy = FixedVectorType::get(CGF->Int32Ty, 4);
+
+  if (NumElements == 1) {
+FixedVectorType *DestTy = FixedVectorType::get(CGF->Int32Ty, 2);
+auto *Bitcast = CGF->Builder.CreateBitCast(Op0, DestTy);
+
+LowBits = CGF->Builder.CreateExtractElement(Bitcast, 0.0);
+HighBits = CGF->Builder.CreateExtractElement(Bitcast, 1.0);

tex3d wrote:

```suggestion
LowBits = CGF->Builder.CreateExtractElement(Bitcast, (uint64_t)0);
HighBits = CGF->Builder.CreateExtractElement(Bitcast, 1);
```

https://github.com/llvm/llvm-project/pull/109331
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Adding splitdouble HLSL function (PR #109331)

2024-10-23 Thread Tex Riddell via cfe-commits


@@ -95,6 +99,144 @@ static void initializeAlloca(CodeGenFunction &CGF, 
AllocaInst *AI, Value *Size,
   I->addAnnotationMetadata("auto-init");
 }
 
+static Value *handleHlslSplitdouble(const CallExpr *E, CodeGenFunction *CGF) {
+  Value *Op0 = CGF->EmitScalarExpr(E->getArg(0));
+  const auto *OutArg1 = dyn_cast(E->getArg(1));
+  const auto *OutArg2 = dyn_cast(E->getArg(2));
+
+  CallArgList Args;
+  LValue Op1TmpLValue =
+  CGF->EmitHLSLOutArgExpr(OutArg1, Args, OutArg1->getType());
+  LValue Op2TmpLValue =
+  CGF->EmitHLSLOutArgExpr(OutArg2, Args, OutArg2->getType());
+
+  if (CGF->getTarget().getCXXABI().areArgsDestroyedLeftToRightInCallee())
+Args.reverseWritebacks();
+
+  auto EmitVectorCode =
+  [](Value *Op, CGBuilderTy *Builder,
+ FixedVectorType *DestTy) -> std::pair {
+Value *bitcast = Builder->CreateBitCast(Op, DestTy);
+
+SmallVector LowbitsIndex;
+SmallVector HighbitsIndex;
+
+for (unsigned int Idx = 0; Idx < DestTy->getNumElements(); Idx += 2) {
+  LowbitsIndex.push_back(Idx);
+  HighbitsIndex.push_back(Idx + 1);
+}
+
+Value *Arg0 = Builder->CreateShuffleVector(bitcast, LowbitsIndex);
+Value *Arg1 = Builder->CreateShuffleVector(bitcast, HighbitsIndex);
+
+return std::make_pair(Arg0, Arg1);
+  };
+
+  Value *LowBits = nullptr;
+  Value *HighBits = nullptr;
+
+  if (CGF->CGM.getTarget().getTriple().isDXIL()) {
+
+llvm::Type *RetElementTy = CGF->Int32Ty;
+if (auto *Op0VecTy = E->getArg(0)->getType()->getAs())
+  RetElementTy = llvm::VectorType::get(
+  CGF->Int32Ty, ElementCount::getFixed(Op0VecTy->getNumElements()));
+auto *RetTy = llvm::StructType::get(RetElementTy, RetElementTy);
+
+CallInst *CI = CGF->Builder.CreateIntrinsic(
+RetTy, Intrinsic::dx_splitdouble, {Op0}, nullptr, "hlsl.splitdouble");
+
+LowBits = CGF->Builder.CreateExtractValue(CI, 0);
+HighBits = CGF->Builder.CreateExtractValue(CI, 1);
+
+  } else {
+// For Non DXIL targets we generate the instructions.
+// TODO: This code accounts for known limitations in
+// SPIR-V and splitdouble. Such should be handled,
+// in a later compilation stage. After [issue link here]
+// is fixed, this shall be refactored.
+
+if (!Op0->getType()->isVectorTy()) {
+  FixedVectorType *DestTy = FixedVectorType::get(CGF->Int32Ty, 2);
+  Value *Bitcast = CGF->Builder.CreateBitCast(Op0, DestTy);
+
+  LowBits = CGF->Builder.CreateExtractElement(Bitcast, 0.0);
+  HighBits = CGF->Builder.CreateExtractElement(Bitcast, 1.0);
+} else {
+
+  const auto *TargTy = E->getArg(0)->getType()->getAs();
+
+  int NumElements = TargTy->getNumElements();
+
+  FixedVectorType *DestTy = FixedVectorType::get(CGF->Int32Ty, 4);
+
+  if (NumElements == 1) {
+FixedVectorType *DestTy = FixedVectorType::get(CGF->Int32Ty, 2);
+auto *Bitcast = CGF->Builder.CreateBitCast(Op0, DestTy);
+
+LowBits = CGF->Builder.CreateExtractElement(Bitcast, 0.0);
+HighBits = CGF->Builder.CreateExtractElement(Bitcast, 1.0);
+  } else if (NumElements == 2) {
+auto [LB, HB] = EmitVectorCode(Op0, &CGF->Builder, DestTy);
+LowBits = LB;
+HighBits = HB;
+  } else {
+
+SmallVector> EmitedValuePairs;
+
+int isOdd = NumElements % 2;
+int NumEvenElements = NumElements - isOdd;
+
+Value *FinalElementCast = nullptr;
+for (int It = 0; It < NumEvenElements; It += 2) {
+  auto Shuff = CGF->Builder.CreateShuffleVector(Op0, {It, It + 1});
+  std::pair ValuePair =
+  EmitVectorCode(Shuff, &CGF->Builder, DestTy);
+  EmitedValuePairs.push_back(ValuePair);
+}
+
+if (isOdd == 1) {
+  FixedVectorType *DestTy = FixedVectorType::get(CGF->Int32Ty, 2);
+  auto *EV = CGF->Builder.CreateExtractElement(Op0, NumElements - 1);
+  FinalElementCast = CGF->Builder.CreateBitCast(EV, DestTy);
+}
+
+SmallVector Index = {0, 1};
+
+auto lb = EmitedValuePairs[0].first;
+auto hb = EmitedValuePairs[0].second;
+
+int EvenSizedPairs = EmitedValuePairs.size() - isOdd;
+
+for (int It = 1; It < EvenSizedPairs; It++) {
+  int CurIndexSize = Index.size();
+  Index.insert(Index.end(), {CurIndexSize, CurIndexSize + 1});
+  lb = CGF->Builder.CreateShuffleVector(lb, EmitedValuePairs[It].first,
+Index);
+  hb = CGF->Builder.CreateShuffleVector(hb, 
EmitedValuePairs[It].second,
+Index);
+}
+
+if (FinalElementCast) {
+  int CurIndexSize = Index.size();
+
+  Index.insert(Index.end(), {CurIndexSize});
+
+  lb = CGF->Builder.CreateShuffleVector(lb, FinalElementCast, Index);
+  hb = CGF->Builder.CreateShuffleVector(hb, FinalElementCast, Index);

tex3d wrote:

This is w

[clang] [llvm] Adding splitdouble HLSL function (PR #109331)

2024-10-23 Thread Tex Riddell via cfe-commits


@@ -95,6 +99,144 @@ static void initializeAlloca(CodeGenFunction &CGF, 
AllocaInst *AI, Value *Size,
   I->addAnnotationMetadata("auto-init");
 }
 
+static Value *handleHlslSplitdouble(const CallExpr *E, CodeGenFunction *CGF) {
+  Value *Op0 = CGF->EmitScalarExpr(E->getArg(0));
+  const auto *OutArg1 = dyn_cast(E->getArg(1));
+  const auto *OutArg2 = dyn_cast(E->getArg(2));
+
+  CallArgList Args;
+  LValue Op1TmpLValue =
+  CGF->EmitHLSLOutArgExpr(OutArg1, Args, OutArg1->getType());
+  LValue Op2TmpLValue =
+  CGF->EmitHLSLOutArgExpr(OutArg2, Args, OutArg2->getType());
+
+  if (CGF->getTarget().getCXXABI().areArgsDestroyedLeftToRightInCallee())
+Args.reverseWritebacks();
+
+  auto EmitVectorCode =
+  [](Value *Op, CGBuilderTy *Builder,
+ FixedVectorType *DestTy) -> std::pair {
+Value *bitcast = Builder->CreateBitCast(Op, DestTy);
+
+SmallVector LowbitsIndex;
+SmallVector HighbitsIndex;
+
+for (unsigned int Idx = 0; Idx < DestTy->getNumElements(); Idx += 2) {
+  LowbitsIndex.push_back(Idx);
+  HighbitsIndex.push_back(Idx + 1);
+}
+
+Value *Arg0 = Builder->CreateShuffleVector(bitcast, LowbitsIndex);
+Value *Arg1 = Builder->CreateShuffleVector(bitcast, HighbitsIndex);
+
+return std::make_pair(Arg0, Arg1);
+  };
+
+  Value *LowBits = nullptr;
+  Value *HighBits = nullptr;
+
+  if (CGF->CGM.getTarget().getTriple().isDXIL()) {
+
+llvm::Type *RetElementTy = CGF->Int32Ty;
+if (auto *Op0VecTy = E->getArg(0)->getType()->getAs())
+  RetElementTy = llvm::VectorType::get(
+  CGF->Int32Ty, ElementCount::getFixed(Op0VecTy->getNumElements()));
+auto *RetTy = llvm::StructType::get(RetElementTy, RetElementTy);
+
+CallInst *CI = CGF->Builder.CreateIntrinsic(
+RetTy, Intrinsic::dx_splitdouble, {Op0}, nullptr, "hlsl.splitdouble");
+
+LowBits = CGF->Builder.CreateExtractValue(CI, 0);
+HighBits = CGF->Builder.CreateExtractValue(CI, 1);
+
+  } else {
+// For Non DXIL targets we generate the instructions.
+// TODO: This code accounts for known limitations in
+// SPIR-V and splitdouble. Such should be handled,
+// in a later compilation stage. After [issue link here]
+// is fixed, this shall be refactored.
+
+if (!Op0->getType()->isVectorTy()) {
+  FixedVectorType *DestTy = FixedVectorType::get(CGF->Int32Ty, 2);
+  Value *Bitcast = CGF->Builder.CreateBitCast(Op0, DestTy);
+
+  LowBits = CGF->Builder.CreateExtractElement(Bitcast, 0.0);
+  HighBits = CGF->Builder.CreateExtractElement(Bitcast, 1.0);
+} else {
+
+  const auto *TargTy = E->getArg(0)->getType()->getAs();
+
+  int NumElements = TargTy->getNumElements();
+
+  FixedVectorType *DestTy = FixedVectorType::get(CGF->Int32Ty, 4);
+
+  if (NumElements == 1) {
+FixedVectorType *DestTy = FixedVectorType::get(CGF->Int32Ty, 2);
+auto *Bitcast = CGF->Builder.CreateBitCast(Op0, DestTy);
+
+LowBits = CGF->Builder.CreateExtractElement(Bitcast, 0.0);
+HighBits = CGF->Builder.CreateExtractElement(Bitcast, 1.0);
+  } else if (NumElements == 2) {
+auto [LB, HB] = EmitVectorCode(Op0, &CGF->Builder, DestTy);
+LowBits = LB;
+HighBits = HB;
+  } else {
+
+SmallVector> EmitedValuePairs;
+
+int isOdd = NumElements % 2;
+int NumEvenElements = NumElements - isOdd;
+
+Value *FinalElementCast = nullptr;
+for (int It = 0; It < NumEvenElements; It += 2) {
+  auto Shuff = CGF->Builder.CreateShuffleVector(Op0, {It, It + 1});
+  std::pair ValuePair =
+  EmitVectorCode(Shuff, &CGF->Builder, DestTy);
+  EmitedValuePairs.push_back(ValuePair);
+}
+
+if (isOdd == 1) {
+  FixedVectorType *DestTy = FixedVectorType::get(CGF->Int32Ty, 2);
+  auto *EV = CGF->Builder.CreateExtractElement(Op0, NumElements - 1);
+  FinalElementCast = CGF->Builder.CreateBitCast(EV, DestTy);
+}
+
+SmallVector Index = {0, 1};
+
+auto lb = EmitedValuePairs[0].first;
+auto hb = EmitedValuePairs[0].second;
+
+int EvenSizedPairs = EmitedValuePairs.size() - isOdd;
+
+for (int It = 1; It < EvenSizedPairs; It++) {

tex3d wrote:

This loop looks more general than it is.  It really only works for 
`EvenSizedPairs == 2` (in other words, vector size: 4).  I think that's 
misleading.

Overall, this non-DXIL code path looks like it's meant to handle any size 
vector, but would have multiple problems if you ever tried with a vector size 
larger than 4.  We will have larger vectors when we support matrix (at least, 
that's what I think they should resolve to).

https://github.com/llvm/llvm-project/pull/109331
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Adding splitdouble HLSL function (PR #109331)

2024-10-23 Thread Tex Riddell via cfe-commits


@@ -95,6 +99,144 @@ static void initializeAlloca(CodeGenFunction &CGF, 
AllocaInst *AI, Value *Size,
   I->addAnnotationMetadata("auto-init");
 }
 
+static Value *handleHlslSplitdouble(const CallExpr *E, CodeGenFunction *CGF) {
+  Value *Op0 = CGF->EmitScalarExpr(E->getArg(0));
+  const auto *OutArg1 = dyn_cast(E->getArg(1));
+  const auto *OutArg2 = dyn_cast(E->getArg(2));
+
+  CallArgList Args;
+  LValue Op1TmpLValue =
+  CGF->EmitHLSLOutArgExpr(OutArg1, Args, OutArg1->getType());
+  LValue Op2TmpLValue =
+  CGF->EmitHLSLOutArgExpr(OutArg2, Args, OutArg2->getType());
+
+  if (CGF->getTarget().getCXXABI().areArgsDestroyedLeftToRightInCallee())
+Args.reverseWritebacks();
+
+  auto EmitVectorCode =
+  [](Value *Op, CGBuilderTy *Builder,
+ FixedVectorType *DestTy) -> std::pair {
+Value *bitcast = Builder->CreateBitCast(Op, DestTy);
+
+SmallVector LowbitsIndex;
+SmallVector HighbitsIndex;
+
+for (unsigned int Idx = 0; Idx < DestTy->getNumElements(); Idx += 2) {
+  LowbitsIndex.push_back(Idx);
+  HighbitsIndex.push_back(Idx + 1);
+}
+
+Value *Arg0 = Builder->CreateShuffleVector(bitcast, LowbitsIndex);
+Value *Arg1 = Builder->CreateShuffleVector(bitcast, HighbitsIndex);
+
+return std::make_pair(Arg0, Arg1);
+  };
+
+  Value *LowBits = nullptr;
+  Value *HighBits = nullptr;
+
+  if (CGF->CGM.getTarget().getTriple().isDXIL()) {
+
+llvm::Type *RetElementTy = CGF->Int32Ty;
+if (auto *Op0VecTy = E->getArg(0)->getType()->getAs())
+  RetElementTy = llvm::VectorType::get(
+  CGF->Int32Ty, ElementCount::getFixed(Op0VecTy->getNumElements()));
+auto *RetTy = llvm::StructType::get(RetElementTy, RetElementTy);
+
+CallInst *CI = CGF->Builder.CreateIntrinsic(
+RetTy, Intrinsic::dx_splitdouble, {Op0}, nullptr, "hlsl.splitdouble");
+
+LowBits = CGF->Builder.CreateExtractValue(CI, 0);
+HighBits = CGF->Builder.CreateExtractValue(CI, 1);
+
+  } else {
+// For Non DXIL targets we generate the instructions.
+// TODO: This code accounts for known limitations in
+// SPIR-V and splitdouble. Such should be handled,
+// in a later compilation stage. After [issue link here]
+// is fixed, this shall be refactored.
+
+if (!Op0->getType()->isVectorTy()) {
+  FixedVectorType *DestTy = FixedVectorType::get(CGF->Int32Ty, 2);
+  Value *Bitcast = CGF->Builder.CreateBitCast(Op0, DestTy);
+
+  LowBits = CGF->Builder.CreateExtractElement(Bitcast, 0.0);
+  HighBits = CGF->Builder.CreateExtractElement(Bitcast, 1.0);

tex3d wrote:

Why the floating point constant indices?  I suspect you found `0` to be 
ambiguous due to the overload with either `uint64_t` or `Value*`, but this is 
an odd way to trick it into implicitly casting to `uint64_t`.

You could use: `(uint64_t)0` (commonly used elsewhere, only required for `0`)
```suggestion
  LowBits = CGF->Builder.CreateExtractElement(Bitcast, (uint64_t)0);
  HighBits = CGF->Builder.CreateExtractElement(Bitcast, 1);
```
Or as some others do, you could explicitly get the constant `Value*` for `0` 
and `1`:
```cpp
  Value *Idx0 = ConstantInt::get(SizeTy, 0);
  Value *Idx1 = ConstantInt::get(SizeTy, 1);
```

https://github.com/llvm/llvm-project/pull/109331
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Adding splitdouble HLSL function (PR #109331)

2024-10-23 Thread Tex Riddell via cfe-commits


@@ -95,6 +99,133 @@ static void initializeAlloca(CodeGenFunction &CGF, 
AllocaInst *AI, Value *Size,
   I->addAnnotationMetadata("auto-init");
 }
 
+static Value *handleHlslSplitdouble(const CallExpr *E, CodeGenFunction *CGF) {
+  Value *Op0 = CGF->EmitScalarExpr(E->getArg(0));
+  const auto *OutArg1 = dyn_cast(E->getArg(1));
+  const auto *OutArg2 = dyn_cast(E->getArg(2));
+
+  CallArgList Args;
+  LValue Op1TmpLValue =
+  CGF->EmitHLSLOutArgExpr(OutArg1, Args, OutArg1->getType());
+  LValue Op2TmpLValue =
+  CGF->EmitHLSLOutArgExpr(OutArg2, Args, OutArg2->getType());
+
+  if (CGF->getTarget().getCXXABI().areArgsDestroyedLeftToRightInCallee())
+Args.reverseWritebacks();
+
+  auto EmitVectorCode =
+  [](Value *Op, CGBuilderTy *Builder,
+ FixedVectorType *DestTy) -> std::pair {
+Value *bitcast = Builder->CreateBitCast(Op, DestTy);
+
+SmallVector LowbitsIndex;
+SmallVector HighbitsIndex;
+
+for (unsigned int Idx = 0; Idx < DestTy->getNumElements(); Idx += 2) {
+  LowbitsIndex.push_back(Idx);
+  HighbitsIndex.push_back(Idx + 1);
+}
+
+Value *Arg0 = Builder->CreateShuffleVector(bitcast, LowbitsIndex);
+Value *Arg1 = Builder->CreateShuffleVector(bitcast, HighbitsIndex);
+
+return std::make_pair(Arg0, Arg1);
+  };

tex3d wrote:

Since this lambda function is only used in the SPIR-V path, it could probably 
be defined in the else block of `if 
(CGF->CGM.getTarget().getTriple().isDXIL())`.

Also, this function looks more general than it is.  It can only handle one 
case, `DestTy` of `<4 x i32>`, since you have to do things differently for 
scalar double cast (using extractelement instead of shuffle).  Since that's the 
case, you don't need a general loop to construct indices, you can make the name 
more specific and clearer, like `EmitDouble2Cast`, and you don't need to supply 
a `DestTy`, since it's always the same cast type.

```suggestion
  // casts `<2 x double>` to `<4 x i32>`, then shuffles into high and low
  // `<2 x i32>` vectors.
  auto EmitDouble2Cast = [](CodeGenFunction &CGF,
Value *DoubleVec2) -> std::pair {
Value *BC = CGF.Builder.CreateBitCast(DoubleVec2,
  FixedVectorType::get(CGF.Int32Ty, 4));
Value *LB = CGF.Builder.CreateShuffleVector(BC, {0, 2});
Value *HB = CGF.Builder.CreateShuffleVector(BC, {1, 3});

return std::make_pair(LB, HB);
  };
```

https://github.com/llvm/llvm-project/pull/109331
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Adding splitdouble HLSL function (PR #109331)

2024-10-23 Thread Tex Riddell via cfe-commits


@@ -0,0 +1,98 @@
+// RUN: %clang_cc1 -finclude-default-header -x hlsl -triple 
dxil-pc-shadermodel6.3-library %s -fnative-half-type -emit-llvm -O1 -o - | 
FileCheck %s
+// RUN: %clang_cc1 -finclude-default-header -x hlsl -triple 
spirv-vulkan-library %s -fnative-half-type -emit-llvm -O0 -o - | FileCheck %s 
--check-prefix=SPIRV
+
+
+
+// CHECK: define {{.*}} i32 {{.*}}test_scalar{{.*}}(double {{.*}} [[VALD:%.*]])
+// CHECK: [[VALRET:%.*]] = {{.*}} call { i32, i32 } 
@llvm.dx.splitdouble.i32(double [[VALD]])
+// CHECK-NEXT: extractvalue { i32, i32 } [[VALRET]], 0
+// CHECK-NEXT: extractvalue { i32, i32 } [[VALRET]], 1
+// SPIRV: define spir_func {{.*}} i32 {{.*}}test_scalar{{.*}}(double {{.*}} 
[[VALD:%.*]])
+// SPIRV-NOT: @llvm.dx.splitdouble.i32
+// SPIRV: [[REG:%.*]] = load double, ptr [[VALD]].addr, align 8
+// SPIRV-NEXT: [[CAST:%.*]] = bitcast double [[REG]] to <2 x i32>
+// SPIRV-NEXT: extractelement <2 x i32> [[CAST]], i64 0
+// SPIRV-NEXT: extractelement <2 x i32> [[CAST]], i64 1
+uint test_scalar(double D) {
+  uint A, B;
+  asuint(D, A, B);
+  return A + B;
+}
+
+// CHECK: define {{.*}} i32 {{.*}}test_double1{{.*}}(<1 x double> {{.*}} 
[[VALD:%.*]])
+// CHECK: [[TRUNC:%.*]] = extractelement <1 x double> %D, i64 0
+// CHECK-NEXT: [[VALRET:%.*]] = {{.*}} call { i32, i32 } 
@llvm.dx.splitdouble.i32(double [[TRUNC]])
+// CHECK-NEXT: extractvalue { i32, i32 } [[VALRET]], 0
+// CHECK-NEXT: extractvalue { i32, i32 } [[VALRET]], 1
+// SPIRV: define spir_func {{.*}} i32 {{.*}}test_double1{{.*}}(<1 x double> 
{{.*}} [[VALD:%.*]])
+// SPIRV-NOT: @llvm.dx.splitdouble.i32
+// SPIRV: [[REG:%.*]] = load <1 x double>, ptr [[VALD]].addr, align 8
+// SPIRV-NEXT: [[TRUNC:%.*]] = extractelement <1 x double> %1, i64 0

tex3d wrote:

@farzonl I wasn't suggesting that we run `spirv-val` in *this* test.  I was 
suggesting that this generated IR should also be tested in the other test 
(`llvm/test/CodeGen/SPIRV/hlsl-intrinsics/splitdouble.ll`) where we test how 
the generated IR patterns lower to SPIR-V and *do* run `spirv-val`.

In reality, I'm expecting that the vector size 1 case is problematic for SPIR-V 
target. I still consider this comment resolved because CGBuiltin is doing it's 
best to operate with scalars, but the incoming type is `<1 x double>` which 
will generate an illegal type in SPIR-V.

Ultimately, I think we need to sort out our vector size 1 plan more thoroughly, 
rather than generate and test something speculative that won't lower to legal 
code, and/or won't be how we want to represent vector1 in IR in the first place.

https://github.com/llvm/llvm-project/pull/109331
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Adding splitdouble HLSL function (PR #109331)

2024-10-24 Thread Tex Riddell via cfe-commits


@@ -0,0 +1,38 @@
+// RUN: %clang_cc1 -finclude-default-header -triple 
dxil-pc-shadermodel6.6-library %s -fnative-half-type -verify
+
+void test_no_second_arg(double D) {
+  __builtin_hlsl_elementwise_splitdouble(D);
+ // expected-error@-1 {{too few arguments to function call, expected 3, have 
1}} 
+}
+
+void test_no_third_arg(double D) {
+  uint A;
+  __builtin_hlsl_elementwise_splitdouble(D, A);
+ // expected-error@-1 {{too few arguments to function call, expected 3, have 
2}} 
+}
+
+void test_too_many_arg(double D) {
+  uint A, B, C;
+  __builtin_hlsl_elementwise_splitdouble(D, A, B, C);
+ // expected-error@-1 {{too many arguments to function call, expected 3, have 
4}} 
+}
+
+void test_first_arg_type_mismatch(bool3 D) {
+  uint3 A, B;
+  __builtin_hlsl_elementwise_splitdouble(D, A, B);
+ // expected-error@-1 {{invalid operand of type 'bool3' (aka 'vector') where 'double' or a vector of such type is required}} 
+}
+
+void test_second_arg_type_mismatch(double D) {
+  bool A;
+  uint B;
+  __builtin_hlsl_elementwise_splitdouble(D, A, B);
+ // expected-error@-1 {{invalid operand of type 'bool' where 'unsigned int' or 
a vector of such type is required}} 
+}
+
+void test_third_arg_type_mismatch(double D) {
+  bool A;
+  uint B;
+  __builtin_hlsl_elementwise_splitdouble(D, B, A);
+ // expected-error@-1 {{invalid operand of type 'bool' where 'unsigned int' or 
a vector of such type is required}} 

tex3d wrote:

Sorry for the late added comment, but it just came to mind: most intrinsics 
have only input arguments, but this has two output arguments.  We don't have 
any tests for cases where the output argument is `const` or not an L-Value 
reference, such that it cannot be written to as an output.  Do we handle that 
gracefully?  That seems like it would be good to try.

https://github.com/llvm/llvm-project/pull/109331
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [HLSL] Add RWStructuredBuffer definition to HLSLExternalSemaSource (PR #113477)

2024-10-23 Thread Tex Riddell via cfe-commits

https://github.com/tex3d approved this pull request.

Looks good.

https://github.com/llvm/llvm-project/pull/113477
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Adding splitdouble HLSL function (PR #109331)

2024-10-23 Thread Tex Riddell via cfe-commits


@@ -0,0 +1,54 @@
+; RUN: llc -verify-machineinstrs -O0 -mtriple=spirv-unknown-unknown %s -o - | 
FileCheck %s
+; RUN: %if spirv-tools %{ llc -O0 -mtriple=spirv-unknown-unknown %s -o - 
-filetype=obj | spirv-val %}
+
+; Make sure lowering is correctly generating spirv code.
+
+; CHECK-DAG: %[[#double:]] = OpTypeFloat 64
+; CHECK-DAG: %[[#int_32:]] = OpTypeInt 32 0
+; CHECK-DAG: %[[#scalar_function:]] = OpTypeFunction %[[#int_32]] %[[#double]]
+; CHECK-DAG: %[[#vec_2_int_32:]] = OpTypeVector %[[#int_32]] 2
+; CHECK-DAG: %[[#vec_4_int_32:]] = OpTypeVector %[[#int_32]] 4
+; CHECK-DAG: %[[#vec_3_int_32:]] = OpTypeVector %[[#int_32]] 3
+; CHECK-DAG: %[[#vec_3_double:]] = OpTypeVector %[[#double]] 3
+; CHECK-DAG: %[[#vector_function:]] = OpTypeFunction %[[#vec_3_int_32]] 
%[[#vec_3_double]]
+; CHECK-DAG: %[[#vec_2_double:]] = OpTypeVector %[[#double]] 2
+
+
+define spir_func noundef i32 @test_scalar(double noundef %D) 
local_unnamed_addr {
+entry:
+  ; CHECK-LABEL: ; -- Begin function test_scalar
+  ; CHECK: %[[#param:]] = OpFunctionParameter %[[#double]]
+  ; CHECK: %[[#bitcast:]] = OpBitcast %[[#vec_2_int_32]] %[[#param]]
+  %0 = bitcast double %D to <2 x i32>
+  ; CHECK: %[[#]] = OpCompositeExtract %[[#int_32]] %[[#bitcast]] 0
+  %1 = extractelement <2 x i32> %0, i64 0
+  ; CHECK: %[[#]] = OpCompositeExtract %[[#int_32]] %[[#bitcast]] 1
+  %2 = extractelement <2 x i32> %0, i64 1
+  %add = add i32 %1, %2
+  ret i32 %add
+}
+
+
+define spir_func noundef <3 x i32> @test_vector(<3 x double> noundef %D) 
local_unnamed_addr {
+entry:
+  ; CHECK-LABEL: ; -- Begin function test_vector
+  ; CHECK: %[[#param:]] = OpFunctionParameter %[[#vec_3_double]]
+  ; %[[#SHUFF1:]] = OpVectorShuffle %[[#vec_2_double]] %[[#param]] %[[#]] 0 1
+  ; %[[#CAST1:]] = OpBitcast %[[#vec_4_int_32]] %[[#SHUFF1]]
+  ; %[[#SHUFF2:]] = OpVectorShuffle %[[#vec_2_int_32]] %[[#CAST1]] %[[#]] 0 2
+  ; %[[#SHUFF3:]] = OpVectorShuffle %[[#vec_2_int_32]] %[[#CAST1]] %[[#]] 1 3
+  ; %[[#EXTRACT:]] = OpCompositeExtract %[[#double]] %[[#param]] 2
+  ; %[[#CAST2:]] = OpBitcast %[[#vec_2_int_32]] %[[#EXTRACT]]
+  ; %[[#]] = OpVectorShuffle %7 %[[#SHUFF2]] %[[#CAST2]] 0 1 2
+  ; %[[#]] = OpVectorShuffle %7 %[[#SHUFF3]] %[[#CAST2]] 0 1 2
+  %0 = shufflevector <3 x double> %D, <3 x double> poison, <2 x i32> 
+  %1 = bitcast <2 x double> %0 to <4 x i32>
+  %2 = shufflevector <4 x i32> %1, <4 x i32> poison, <2 x i32> 
+  %3 = shufflevector <4 x i32> %1, <4 x i32> poison, <2 x i32> 
+  %4 = extractelement <3 x double> %D, i64 2
+  %5 = bitcast double %4 to <2 x i32>
+  %6 = shufflevector <2 x i32> %2, <2 x i32> %5, <3 x i32> 
+  %7 = shufflevector <2 x i32> %3, <2 x i32> %5, <3 x i32> 

tex3d wrote:

This IR has the same fault as pointed out in the checks 
[here](https://github.com/llvm/llvm-project/pull/109331/files#r1813586399).

https://github.com/llvm/llvm-project/pull/109331
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Adding splitdouble HLSL function (PR #109331)

2024-10-23 Thread Tex Riddell via cfe-commits


@@ -0,0 +1,98 @@
+// RUN: %clang_cc1 -finclude-default-header -x hlsl -triple 
dxil-pc-shadermodel6.3-library %s -fnative-half-type -emit-llvm -O1 -o - | 
FileCheck %s
+// RUN: %clang_cc1 -finclude-default-header -x hlsl -triple 
spirv-vulkan-library %s -fnative-half-type -emit-llvm -O0 -o - | FileCheck %s 
--check-prefix=SPIRV
+
+
+
+// CHECK: define {{.*}} i32 {{.*}}test_scalar{{.*}}(double {{.*}} [[VALD:%.*]])
+// CHECK: [[VALRET:%.*]] = {{.*}} call { i32, i32 } 
@llvm.dx.splitdouble.i32(double [[VALD]])
+// CHECK-NEXT: extractvalue { i32, i32 } [[VALRET]], 0
+// CHECK-NEXT: extractvalue { i32, i32 } [[VALRET]], 1
+// SPIRV: define spir_func {{.*}} i32 {{.*}}test_scalar{{.*}}(double {{.*}} 
[[VALD:%.*]])
+// SPIRV-NOT: @llvm.dx.splitdouble.i32
+// SPIRV: [[REG:%.*]] = load double, ptr [[VALD]].addr, align 8
+// SPIRV-NEXT: [[CAST:%.*]] = bitcast double [[REG]] to <2 x i32>
+// SPIRV-NEXT: extractelement <2 x i32> [[CAST]], i64 0
+// SPIRV-NEXT: extractelement <2 x i32> [[CAST]], i64 1
+uint test_scalar(double D) {
+  uint A, B;
+  asuint(D, A, B);
+  return A + B;
+}
+
+// CHECK: define {{.*}} i32 {{.*}}test_double1{{.*}}(<1 x double> {{.*}} 
[[VALD:%.*]])
+// CHECK: [[TRUNC:%.*]] = extractelement <1 x double> %D, i64 0
+// CHECK-NEXT: [[VALRET:%.*]] = {{.*}} call { i32, i32 } 
@llvm.dx.splitdouble.i32(double [[TRUNC]])
+// CHECK-NEXT: extractvalue { i32, i32 } [[VALRET]], 0
+// CHECK-NEXT: extractvalue { i32, i32 } [[VALRET]], 1
+// SPIRV: define spir_func {{.*}} i32 {{.*}}test_double1{{.*}}(<1 x double> 
{{.*}} [[VALD:%.*]])
+// SPIRV-NOT: @llvm.dx.splitdouble.i32
+// SPIRV: [[REG:%.*]] = load <1 x double>, ptr [[VALD]].addr, align 8
+// SPIRV-NEXT: [[TRUNC:%.*]] = extractelement <1 x double> %1, i64 0
+// SPIRV-NEXT: [[CAST:%.*]] = bitcast double [[TRUNC]] to <2 x i32>
+// SPIRV-NEXT: extractelement <2 x i32> [[CAST]], i64 0
+// SPIRV-NEXT: extractelement <2 x i32> [[CAST]], i64 1
+uint test_double1(double1 D) {
+  uint A, B;
+  asuint(D, A, B);
+  return A + B;
+}
+
+// CHECK: define {{.*}} <2 x i32> {{.*}}test_vector2{{.*}}(<2 x double> {{.*}} 
[[VALD:%.*]])
+// CHECK: [[VALRET:%.*]] = {{.*}} call { <2 x i32>, <2 x i32> } 
@llvm.dx.splitdouble.v2i32(<2 x double> [[VALD]])
+// CHECK-NEXT: extractvalue { <2 x i32>, <2 x i32> } [[VALRET]], 0
+// CHECK-NEXT: extractvalue { <2 x i32>, <2 x i32> } [[VALRET]], 1
+// SPIRV: define spir_func {{.*}} <2 x i32> {{.*}}test_vector2{{.*}}(<2 x 
double> {{.*}} [[VALD:%.*]])
+// SPIRV-NOT: @llvm.dx.splitdouble.i32
+// SPIRV: [[REG:%.*]] = load <2 x double>, ptr [[VALD]].addr, align 16
+// SPIRV-NEXT: [[CAST1:%.*]] = bitcast <2 x double> [[REG]] to <4 x i32>
+// SPIRV-NEXT: [[SHUF1:%.*]] = shufflevector <4 x i32> [[CAST1]], <4 x i32> 
poison, <2 x i32> 
+// SPIRV-NEXT: [[SHUF2:%.*]] = shufflevector <4 x i32> [[CAST1]], <4 x i32> 
poison, <2 x i32> 
+uint2 test_vector2(double2 D) {
+  uint2 A, B;
+  asuint(D, A, B);
+  return A + B;
+}
+
+// CHECK: define {{.*}} <3 x i32> {{.*}}test_vector3{{.*}}(<3 x double> {{.*}} 
[[VALD:%.*]])
+// CHECK: [[VALRET:%.*]] = {{.*}} call { <3 x i32>, <3 x i32> } 
@llvm.dx.splitdouble.v3i32(<3 x double> [[VALD]])
+// CHECK-NEXT: extractvalue { <3 x i32>, <3 x i32> } [[VALRET]], 0
+// CHECK-NEXT: extractvalue { <3 x i32>, <3 x i32> } [[VALRET]], 1
+// SPIRV: define spir_func {{.*}} <3 x i32> {{.*}}test_vector3{{.*}}(<3 x 
double> {{.*}} [[VALD:%.*]])
+// SPIRV-NOT: @llvm.dx.splitdouble.i32
+// SPIRV: [[REG:%.*]] = load <3 x double>, ptr [[VALD]].addr, align 32
+// SPIRV-NEXT: [[VALRET1:%.*]] = shufflevector <3 x double> [[REG]], <3 x 
double> poison, <2 x i32> 
+// SPIRV-NEXT: [[CAST1:%.*]] = bitcast <2 x double> [[VALRET1]] to <4 x i32>
+// SPIRV-NEXT: [[SHUF1:%.*]] = shufflevector <4 x i32> [[CAST1]], <4 x i32> 
poison, <2 x i32> 
+// SPIRV-NEXT: [[SHUF2:%.*]] = shufflevector <4 x i32> [[CAST1]], <4 x i32> 
poison, <2 x i32> 
+// SPIRV-NEXT: [[EXTRACT:%.*]] = extractelement <3 x double> [[REG]], i64 2
+// SPIRV-NEXT: [[CAST:%.*]] = bitcast double [[EXTRACT]] to <2 x i32>
+// SPIRV-NEXT: %[[#]] = shufflevector <2 x i32> [[SHUF1]], <2 x i32> [[CAST]], 
<3 x i32> 
+// SPIRV-NEXT: %[[#]] = shufflevector <2 x i32> [[SHUF2]], <2 x i32> [[CAST]], 
<3 x i32> 

tex3d wrote:

This shuffle does not look correct.  It should be selecting the second value 
from the second vector.  It's confusing because the first vector has already 
been shuffled into separate high/low vectors, but the second vector contains 
one high and one low component that needs indexing differently between these 
last two shuffles.
```suggestion
// SPIRV-NEXT: %[[#]] = shufflevector <2 x i32> [[SHUF2]], <2 x i32> [[CAST]], 
<3 x i32> 
```

https://github.com/llvm/llvm-project/pull/109331
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Adding splitdouble HLSL function (PR #109331)

2024-10-23 Thread Tex Riddell via cfe-commits


@@ -0,0 +1,98 @@
+// RUN: %clang_cc1 -finclude-default-header -x hlsl -triple 
dxil-pc-shadermodel6.3-library %s -fnative-half-type -emit-llvm -O1 -o - | 
FileCheck %s
+// RUN: %clang_cc1 -finclude-default-header -x hlsl -triple 
spirv-vulkan-library %s -fnative-half-type -emit-llvm -O0 -o - | FileCheck %s 
--check-prefix=SPIRV
+
+
+
+// CHECK: define {{.*}} i32 {{.*}}test_scalar{{.*}}(double {{.*}} [[VALD:%.*]])
+// CHECK: [[VALRET:%.*]] = {{.*}} call { i32, i32 } 
@llvm.dx.splitdouble.i32(double [[VALD]])
+// CHECK-NEXT: extractvalue { i32, i32 } [[VALRET]], 0
+// CHECK-NEXT: extractvalue { i32, i32 } [[VALRET]], 1
+// SPIRV: define spir_func {{.*}} i32 {{.*}}test_scalar{{.*}}(double {{.*}} 
[[VALD:%.*]])
+// SPIRV-NOT: @llvm.dx.splitdouble.i32
+// SPIRV: [[REG:%.*]] = load double, ptr [[VALD]].addr, align 8
+// SPIRV-NEXT: [[CAST:%.*]] = bitcast double [[REG]] to <2 x i32>
+// SPIRV-NEXT: extractelement <2 x i32> [[CAST]], i64 0
+// SPIRV-NEXT: extractelement <2 x i32> [[CAST]], i64 1
+uint test_scalar(double D) {
+  uint A, B;
+  asuint(D, A, B);
+  return A + B;
+}
+
+// CHECK: define {{.*}} i32 {{.*}}test_double1{{.*}}(<1 x double> {{.*}} 
[[VALD:%.*]])
+// CHECK: [[TRUNC:%.*]] = extractelement <1 x double> %D, i64 0
+// CHECK-NEXT: [[VALRET:%.*]] = {{.*}} call { i32, i32 } 
@llvm.dx.splitdouble.i32(double [[TRUNC]])
+// CHECK-NEXT: extractvalue { i32, i32 } [[VALRET]], 0
+// CHECK-NEXT: extractvalue { i32, i32 } [[VALRET]], 1
+// SPIRV: define spir_func {{.*}} i32 {{.*}}test_double1{{.*}}(<1 x double> 
{{.*}} [[VALD:%.*]])
+// SPIRV-NOT: @llvm.dx.splitdouble.i32
+// SPIRV: [[REG:%.*]] = load <1 x double>, ptr [[VALD]].addr, align 8
+// SPIRV-NEXT: [[TRUNC:%.*]] = extractelement <1 x double> %1, i64 0

tex3d wrote:

I suspect the `<1 x double>` types here will break SPIR-V lowering.  I suggest 
copying the resulting IR into the 
`llvm/test/CodeGen/SPIRV/hlsl-intrinsics/splitdouble.ll` test to see if it 
generates SPIR-V that passes `spirv-val`.

We should probably have a discussion on what to do with `<1 x type>` vectors 
when going from HLSL to llvm IR.  I suspect we should alias these to scalar 
instead of preserving the vector type.

https://github.com/llvm/llvm-project/pull/109331
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Adding splitdouble HLSL function (PR #109331)

2024-10-23 Thread Tex Riddell via cfe-commits


@@ -0,0 +1,99 @@
+// RUN: %clang_cc1 -finclude-default-header -x hlsl -triple 
dxil-pc-shadermodel6.3-library %s -fnative-half-type -emit-llvm -O1 -o - | 
FileCheck %s
+// RUN: %clang_cc1 -finclude-default-header -x hlsl -triple 
spirv-vulkan-library %s -fnative-half-type -emit-llvm -O0 -o - | FileCheck %s 
--check-prefix=SPIRV
+
+
+
+// CHECK: define {{.*}} i32 {{.*}}test_scalar{{.*}}(double {{.*}} [[VALD:%.*]])
+// CHECK: [[VALRET:%.*]] = {{.*}} call { i32, i32 } 
@llvm.dx.splitdouble.i32(double [[VALD]])
+// CHECK-NEXT: extractvalue { i32, i32 } [[VALRET]], 0
+// CHECK-NEXT: extractvalue { i32, i32 } [[VALRET]], 1
+// SPIRV: define spir_func {{.*}} i32 {{.*}}test_scalar{{.*}}(double {{.*}} 
[[VALD:%.*]])
+// SPIRV-NOT: @llvm.dx.splitdouble.i32
+// SPIRV: [[REG:%.*]] = load double, ptr [[VALD]].addr, align 8
+// SPIRV-NEXT: [[CAST:%.*]] = bitcast double [[REG]] to <2 x i32>
+// SPIRV-NEXT: extractelement <2 x i32> [[CAST]], i64 0
+// SPIRV-NEXT: extractelement <2 x i32> [[CAST]], i64 1
+uint test_scalar(double D) {
+  uint A, B;
+  asuint(D, A, B);
+  return A + B;
+}
+
+// CHECK: define {{.*}} i32 {{.*}}test_double1{{.*}}(<1 x double> {{.*}} 
[[VALD:%.*]])
+// CHECK: [[TRUNC:%.*]] = extractelement <1 x double> %D, i64 0
+// CHECK-NEXT: [[VALRET:%.*]] = {{.*}} call { i32, i32 } 
@llvm.dx.splitdouble.i32(double [[TRUNC]])
+// CHECK-NEXT: extractvalue { i32, i32 } [[VALRET]], 0
+// CHECK-NEXT: extractvalue { i32, i32 } [[VALRET]], 1
+// SPIRV: define spir_func {{.*}} i32 {{.*}}test_double1{{.*}}(<1 x double> 
{{.*}} [[VALD:%.*]])
+// SPIRV-NOT: @llvm.dx.splitdouble.i32
+// SPIRV: [[REG:%.*]] = load <1 x double>, ptr [[VALD]].addr, align 8
+// SPIRV-NEXT: [[TRUNC:%.*]] = extractelement <1 x double> %1, i64 0
+// SPIRV-NEXT: [[CAST:%.*]] = bitcast double [[TRUNC]] to <2 x i32>
+// SPIRV-NEXT: extractelement <2 x i32> [[CAST]], i64 0
+// SPIRV-NEXT: extractelement <2 x i32> [[CAST]], i64 1
+uint test_double1(double1 D) {
+  uint A, B;

tex3d wrote:

I think these should be vectors to match `D` and be sure the vector1 case is 
triggered, rather than the vector1 being implicitly cast to scalar and 
triggering the scalar expansion:
```suggestion
  uint1 A, B;
```

https://github.com/llvm/llvm-project/pull/109331
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Emit constrained atan2 intrinsic for clang builtin (PR #113636)

2024-11-12 Thread Tex Riddell via cfe-commits

https://github.com/tex3d closed https://github.com/llvm/llvm-project/pull/113636
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Emit constrained atan2 intrinsic for clang builtin (PR #113636)

2024-11-12 Thread Tex Riddell via cfe-commits

https://github.com/tex3d edited https://github.com/llvm/llvm-project/pull/113636
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Add CHECK-LABEL to avoid source tree path sensitivity in test (PR #112461)

2024-10-31 Thread Tex Riddell via cfe-commits

https://github.com/tex3d closed https://github.com/llvm/llvm-project/pull/112461
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [HLSL][clang] Add elementwise builtin for atan2 (p3) (PR #110187)

2024-09-26 Thread Tex Riddell via cfe-commits

https://github.com/tex3d created 
https://github.com/llvm/llvm-project/pull/110187

This change is part of this proposal: 
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

- Add HLSL frontend for atan2
- Add clang Builtin, map to new llvm.atan2
- SemaChecking restrict to floating point and 2 args
- SemaHLSL restrict to float or half.
- Add to clang ReleaseNotes.rst and LanguageExtensions.rst

Part 3 for Implement the atan2 HLSL Function #70096.

>From 8664e3db8da3769514dff65a421101b0e60c0cd3 Mon Sep 17 00:00:00 2001
From: Tex Riddell 
Date: Fri, 13 Sep 2024 18:56:58 -0700
Subject: [PATCH] [HLSL][clang] Add elementwise builtin for atan2 (p3)

This change is part of this proposal: 
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

- Add HLSL frontend for atan2
- Add clang Builtin, map to new llvm.atan2
- SemaChecking restrict to floating point and 2 args
- SemaHLSL restrict to float or half.
- Add to clang ReleaseNotes.rst and LanguageExtensions.rst
---
 clang/docs/LanguageExtensions.rst |  1 +
 clang/docs/ReleaseNotes.rst   |  2 +
 clang/include/clang/Basic/Builtins.td |  6 ++
 clang/lib/CodeGen/CGBuiltin.cpp   |  3 +
 clang/lib/Headers/hlsl/hlsl_intrinsics.h  | 30 ++
 clang/lib/Sema/SemaChecking.cpp   |  1 +
 clang/lib/Sema/SemaHLSL.cpp   |  1 +
 .../test/CodeGen/builtins-elementwise-math.c  | 20 +++
 .../CodeGen/strictfp-elementwise-bulitins.cpp | 10 
 clang/test/CodeGenHLSL/builtins/atan2.hlsl| 59 +++
 clang/test/Sema/aarch64-sve-vector-trig-ops.c |  6 ++
 clang/test/Sema/builtins-elementwise-math.c   | 24 
 clang/test/Sema/riscv-rvv-vector-trig-ops.c   |  6 ++
 .../SemaCXX/builtins-elementwise-math.cpp |  7 +++
 .../BuiltIns/half-float-only-errors2.hlsl |  7 +++
 15 files changed, 183 insertions(+)
 create mode 100644 clang/test/CodeGenHLSL/builtins/atan2.hlsl
 create mode 100644 clang/test/SemaHLSL/BuiltIns/half-float-only-errors2.hlsl

diff --git a/clang/docs/LanguageExtensions.rst 
b/clang/docs/LanguageExtensions.rst
index 0c6b9b1b8f9ce4..26da26d4670a37 100644
--- a/clang/docs/LanguageExtensions.rst
+++ b/clang/docs/LanguageExtensions.rst
@@ -660,6 +660,7 @@ Unless specified otherwise operation(±0) = ±0 and 
operation(±infinity) = ±in
  T __builtin_elementwise_asin(T x)   return the arcsine of x 
interpreted as an angle in radians   floating point types
  T __builtin_elementwise_acos(T x)   return the arccosine of x 
interpreted as an angle in radians floating point types
  T __builtin_elementwise_atan(T x)   return the arctangent of x 
interpreted as an angle in radiansfloating point types
+ T __builtin_elementwise_atan2(T y, T x) return the arctangent of y/x  
   floating point types
  T __builtin_elementwise_sinh(T x)   return the hyperbolic sine of 
angle x in radians floating point types
  T __builtin_elementwise_cosh(T x)   return the hyperbolic cosine of 
angle x in radians   floating point types
  T __builtin_elementwise_tanh(T x)   return the hyperbolic tangent of 
angle x in radians  floating point types
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 14907e7db18de3..8369f25e6583ce 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -536,6 +536,8 @@ DWARF Support in Clang
 Floating Point Support in Clang
 ---
 
+- Add ``__builtin_elementwise_atan2`` builtin for floating point types only.
+
 Fixed Point Support in Clang
 
 
diff --git a/clang/include/clang/Basic/Builtins.td 
b/clang/include/clang/Basic/Builtins.td
index 33791270800c9d..687ee7d5d43b62 100644
--- a/clang/include/clang/Basic/Builtins.td
+++ b/clang/include/clang/Basic/Builtins.td
@@ -1250,6 +1250,12 @@ def ElementwiseATan : Builtin {
   let Prototype = "void(...)";
 }
 
+def ElementwiseATan2 : Builtin {
+  let Spellings = ["__builtin_elementwise_atan2"];
+  let Attributes = [NoThrow, Const, CustomTypeChecking];
+  let Prototype = "void(...)";
+}
+
 def ElementwiseBitreverse : Builtin {
   let Spellings = ["__builtin_elementwise_bitreverse"];
   let Attributes = [NoThrow, Const, CustomTypeChecking];
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 9033cd1ccd781d..2c8e52d6777478 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -3835,6 +3835,9 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl 
GD, unsigned BuiltinID,
   case Builtin::BI__builtin_elementwise_atan:
 return RValue::get(emitBuiltinWithOneOverloadedType<1>(
 *this, E, llvm::Intrinsic::atan, "elt.atan"));
+  case Builtin::BI__builtin_elementwise_atan2:
+return RValue::get(emitBuiltinWithOneOverloadedType<2>(
+*this, E, llvm::Intrinsic::atan2, "elt.atan2"));
   cas

[clang] [HLSL][clang] Add elementwise builtin for atan2 (p3) (PR #110187)

2024-10-01 Thread Tex Riddell via cfe-commits


@@ -0,0 +1,7 @@
+// RUN: %clang_cc1 -finclude-default-header -triple 
dxil-pc-shadermodel6.6-library %s -fnative-half-type -emit-llvm-only 
-disable-llvm-passes -verify -DTEST_FUNC=__builtin_elementwise_atan2
+// RUN: %clang_cc1 -finclude-default-header -triple 
dxil-pc-shadermodel6.6-library %s -fnative-half-type -emit-llvm-only 
-disable-llvm-passes -verify -DTEST_FUNC=__builtin_elementwise_pow
+

tex3d wrote:

Sorry, I updated that one here.  Is that ok?

https://github.com/llvm/llvm-project/pull/110187
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [HLSL][clang] Add elementwise builtin for atan2 (p3) (PR #110187)

2024-09-30 Thread Tex Riddell via cfe-commits

https://github.com/tex3d updated 
https://github.com/llvm/llvm-project/pull/110187

>From 3669af3b40c85f1287f2c2fcb27bfc4282babd6a Mon Sep 17 00:00:00 2001
From: Tex Riddell 
Date: Fri, 13 Sep 2024 18:56:58 -0700
Subject: [PATCH] [HLSL][clang] Add elementwise builtin for atan2 (p3)

This change is part of this proposal: 
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

- Add HLSL frontend for atan2
- Add clang Builtin, map to new llvm.atan2
- SemaChecking restrict to floating point and 2 args
- SemaHLSL restrict to float or half.
- Add to clang ReleaseNotes.rst and LanguageExtensions.rst
---
 clang/docs/LanguageExtensions.rst |  1 +
 clang/docs/ReleaseNotes.rst   |  2 +
 clang/include/clang/Basic/Builtins.td |  6 ++
 clang/lib/CodeGen/CGBuiltin.cpp   |  3 +
 clang/lib/Headers/hlsl/hlsl_intrinsics.h  | 30 ++
 clang/lib/Sema/SemaChecking.cpp   |  1 +
 clang/lib/Sema/SemaHLSL.cpp   |  1 +
 .../test/CodeGen/builtins-elementwise-math.c  | 20 +++
 .../CodeGen/strictfp-elementwise-bulitins.cpp | 10 
 clang/test/CodeGenHLSL/builtins/atan2.hlsl| 59 +++
 clang/test/Sema/aarch64-sve-vector-trig-ops.c |  6 ++
 clang/test/Sema/builtins-elementwise-math.c   | 24 
 clang/test/Sema/riscv-rvv-vector-trig-ops.c   |  6 ++
 .../SemaCXX/builtins-elementwise-math.cpp |  7 +++
 .../BuiltIns/half-float-only-errors2.hlsl |  7 +++
 15 files changed, 183 insertions(+)
 create mode 100644 clang/test/CodeGenHLSL/builtins/atan2.hlsl
 create mode 100644 clang/test/SemaHLSL/BuiltIns/half-float-only-errors2.hlsl

diff --git a/clang/docs/LanguageExtensions.rst 
b/clang/docs/LanguageExtensions.rst
index ea4b4bcec55e77..c86b85d45b064c 100644
--- a/clang/docs/LanguageExtensions.rst
+++ b/clang/docs/LanguageExtensions.rst
@@ -660,6 +660,7 @@ Unless specified otherwise operation(±0) = ±0 and 
operation(±infinity) = ±in
  T __builtin_elementwise_asin(T x)   return the arcsine of x 
interpreted as an angle in radians   floating point types
  T __builtin_elementwise_acos(T x)   return the arccosine of x 
interpreted as an angle in radians floating point types
  T __builtin_elementwise_atan(T x)   return the arctangent of x 
interpreted as an angle in radiansfloating point types
+ T __builtin_elementwise_atan2(T y, T x) return the arctangent of y/x  
   floating point types
  T __builtin_elementwise_sinh(T x)   return the hyperbolic sine of 
angle x in radians floating point types
  T __builtin_elementwise_cosh(T x)   return the hyperbolic cosine of 
angle x in radians   floating point types
  T __builtin_elementwise_tanh(T x)   return the hyperbolic tangent of 
angle x in radians  floating point types
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 02dfbfaaea2071..d193378424ccda 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -565,6 +565,8 @@ DWARF Support in Clang
 Floating Point Support in Clang
 ---
 
+- Add ``__builtin_elementwise_atan2`` builtin for floating point types only.
+
 Fixed Point Support in Clang
 
 
diff --git a/clang/include/clang/Basic/Builtins.td 
b/clang/include/clang/Basic/Builtins.td
index 8090119e512fbb..b2eb747391ce07 100644
--- a/clang/include/clang/Basic/Builtins.td
+++ b/clang/include/clang/Basic/Builtins.td
@@ -1250,6 +1250,12 @@ def ElementwiseATan : Builtin {
   let Prototype = "void(...)";
 }
 
+def ElementwiseATan2 : Builtin {
+  let Spellings = ["__builtin_elementwise_atan2"];
+  let Attributes = [NoThrow, Const, CustomTypeChecking];
+  let Prototype = "void(...)";
+}
+
 def ElementwiseBitreverse : Builtin {
   let Spellings = ["__builtin_elementwise_bitreverse"];
   let Attributes = [NoThrow, Const, CustomTypeChecking];
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index d739597de4c855..0b7eb12589c6b7 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -3836,6 +3836,9 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl 
GD, unsigned BuiltinID,
   case Builtin::BI__builtin_elementwise_atan:
 return RValue::get(emitBuiltinWithOneOverloadedType<1>(
 *this, E, llvm::Intrinsic::atan, "elt.atan"));
+  case Builtin::BI__builtin_elementwise_atan2:
+return RValue::get(emitBuiltinWithOneOverloadedType<2>(
+*this, E, llvm::Intrinsic::atan2, "elt.atan2"));
   case Builtin::BI__builtin_elementwise_ceil:
 return RValue::get(emitBuiltinWithOneOverloadedType<1>(
 *this, E, llvm::Intrinsic::ceil, "elt.ceil"));
diff --git a/clang/lib/Headers/hlsl/hlsl_intrinsics.h 
b/clang/lib/Headers/hlsl/hlsl_intrinsics.h
index 810a16d75f0228..d28f204e352de5 100644
--- a/clang/lib/Headers/hlsl/hlsl_intrinsics.h
+++ b/clang/lib/Headers/hlsl/hlsl_

[clang] [HLSL][clang] Add elementwise builtin for atan2 (p3) (PR #110187)

2024-10-01 Thread Tex Riddell via cfe-commits

https://github.com/tex3d closed https://github.com/llvm/llvm-project/pull/110187
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Add __builtin_(elementwise|reduce)_(max|min)imum (PR #110198)

2024-10-01 Thread Tex Riddell via cfe-commits

tex3d wrote:

You should be able to re-land your change if you add the test fix to it.

https://github.com/llvm/llvm-project/pull/110198
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Add __builtin_(elementwise|reduce)_(max|min)imum (PR #110198)

2024-10-01 Thread Tex Riddell via cfe-commits

tex3d wrote:

I just pushed a fix for the test: 
https://github.com/llvm/llvm-project/commit/793ded7d0b7f1407636a98007f83074b8dd5f765

https://github.com/llvm/llvm-project/pull/110198
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Add __builtin_(elementwise|reduce)_(max|min)imum (PR #110198)

2024-10-01 Thread Tex Riddell via cfe-commits

tex3d wrote:

Oh dang.  That's why I didn't see an update.  Because the revert wasn't pushed 
to llvm/main, it was a revert on another repo.  Looks like I could re-apply the 
test fix again unless you're planning on doing a revert on llvm/main.

https://github.com/llvm/llvm-project/pull/110198
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] 793ded7 - Fix failing test caused by b70d327

2024-10-01 Thread Tex Riddell via cfe-commits

Author: Tex Riddell
Date: 2024-10-01T18:05:05-07:00
New Revision: 793ded7d0b7f1407636a98007f83074b8dd5f765

URL: 
https://github.com/llvm/llvm-project/commit/793ded7d0b7f1407636a98007f83074b8dd5f765
DIFF: 
https://github.com/llvm/llvm-project/commit/793ded7d0b7f1407636a98007f83074b8dd5f765.diff

LOG: Fix failing test caused by b70d327

`clang/test/Sema/aarch64-sve-vector-trig-ops.c` wasn't updated when merging PR 
#110187, which changed the expected diagnostics for the atan2 test.

Added: 


Modified: 
clang/test/Sema/aarch64-sve-vector-trig-ops.c

Removed: 




diff  --git a/clang/test/Sema/aarch64-sve-vector-trig-ops.c 
b/clang/test/Sema/aarch64-sve-vector-trig-ops.c
index 31f608bf151099..3fe6834be2e0b7 100644
--- a/clang/test/Sema/aarch64-sve-vector-trig-ops.c
+++ b/clang/test/Sema/aarch64-sve-vector-trig-ops.c
@@ -25,7 +25,7 @@ svfloat32_t test_atan_vv_i8mf8(svfloat32_t v) {
 svfloat32_t test_atan2_vv_i8mf8(svfloat32_t v) {
 
   return __builtin_elementwise_atan2(v, v);
-  // expected-error@-1 {{1st argument must be a vector, integer or floating 
point type}}
+  // expected-error@-1 {{1st argument must be a floating point type}}
 }
 
 svfloat32_t test_sin_vv_i8mf8(svfloat32_t v) {



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] 5d308af - Revert "Fix failing test caused by b70d327"

2024-10-01 Thread Tex Riddell via cfe-commits

Author: Tex Riddell
Date: 2024-10-01T18:11:20-07:00
New Revision: 5d308af894ccc3f7a288d6abd6f9097b4cbc8cf4

URL: 
https://github.com/llvm/llvm-project/commit/5d308af894ccc3f7a288d6abd6f9097b4cbc8cf4
DIFF: 
https://github.com/llvm/llvm-project/commit/5d308af894ccc3f7a288d6abd6f9097b4cbc8cf4.diff

LOG: Revert "Fix failing test caused by b70d327"

This reverts commit 793ded7d0b7f1407636a98007f83074b8dd5f765.

Added: 


Modified: 
clang/test/Sema/aarch64-sve-vector-trig-ops.c

Removed: 




diff  --git a/clang/test/Sema/aarch64-sve-vector-trig-ops.c 
b/clang/test/Sema/aarch64-sve-vector-trig-ops.c
index 3fe6834be2e0b7..31f608bf151099 100644
--- a/clang/test/Sema/aarch64-sve-vector-trig-ops.c
+++ b/clang/test/Sema/aarch64-sve-vector-trig-ops.c
@@ -25,7 +25,7 @@ svfloat32_t test_atan_vv_i8mf8(svfloat32_t v) {
 svfloat32_t test_atan2_vv_i8mf8(svfloat32_t v) {
 
   return __builtin_elementwise_atan2(v, v);
-  // expected-error@-1 {{1st argument must be a floating point type}}
+  // expected-error@-1 {{1st argument must be a vector, integer or floating 
point type}}
 }
 
 svfloat32_t test_sin_vv_i8mf8(svfloat32_t v) {



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [HLSL] Add `Increment`/`DecrementCounter` methods to structured buffers (PR #114148)

2024-11-07 Thread Tex Riddell via cfe-commits


@@ -343,27 +336,224 @@ struct TemplateParameterListBuilder {
 Params.clear();
 
 QualType T = Builder.Template->getInjectedClassNameSpecialization();
-T = S.Context.getInjectedClassNameType(Builder.Record, T);
+T = AST.getInjectedClassNameType(Builder.Record, T);
 
 return Builder;
   }
 };
+
+// Builder for methods of builtin types. Allows adding methods to builtin types
+// using the builder pattern like this:
+//
+//   BuiltinTypeMethodBuilder(Sema, RecordBuilder, "MethodName", ReturnType)
+//   .addParam("param_name", Type, InOutModifier)

tex3d wrote:

I don't see anything adding or using parameters here.  I also don't know how 
you'd use a parameter you added this way in the builtin call params you pass in 
(how do you reference it to create the DeclRefExpr?).

Since this part is unused, should we really be adding it at this time?  Could 
we add it when it's used to make sure it's working as intended?

https://github.com/llvm/llvm-project/pull/114148
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [HLSL] Add `Increment`/`DecrementCounter` methods to structured buffers (PR #114148)

2024-11-07 Thread Tex Riddell via cfe-commits


@@ -4883,6 +4882,12 @@ def HLSLRadians : LangBuiltin<"HLSL_LANG"> {
   let Prototype = "void(...)";
 }
 
+def HLSLBufferUpdateCounter : LangBuiltin<"HLSL_LANG"> {
+  let Spellings = ["__builtin_hlsl_buffer_update_counter"];
+  let Attributes = [NoThrow];
+  let Prototype = "uint32_t(...)";

tex3d wrote:

Why does this look like it's overloaded?  Shouldn't this have a specific 
signature like `"uint32_t(__hlsl_resource_t, int)"`?

https://github.com/llvm/llvm-project/pull/114148
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [HLSL] Add `Increment`/`DecrementCounter` methods to structured buffers (PR #114148)

2024-11-07 Thread Tex Riddell via cfe-commits


@@ -109,22 +116,16 @@ struct BuiltinTypeDeclBuilder {
   }
 
   BuiltinTypeDeclBuilder &
-  addHandleMember(Sema &S, ResourceClass RC, ResourceKind RK, bool IsROV,
-  bool RawBuffer,
+  addHandleMember(ResourceClass RC, ResourceKind RK, bool IsROV, bool 
RawBuffer,
   AccessSpecifier Access = AccessSpecifier::AS_private) {
-if (Record->isCompleteDefinition())
-  return *this;
+assert(!Record->isCompleteDefinition() && "record is already complete");
 
 ASTContext &Ctx = S.getASTContext();
 TypeSourceInfo *ElementTypeInfo = nullptr;
 
 QualType ElemTy = Ctx.Char8Ty;

tex3d wrote:

Pre-existing side-note: Does this mean that `ByteAddressBuffer` will have an 
element type of `i8`?  Shouldn't it be `void`, to clearly differentiate this 
case?

https://github.com/llvm/llvm-project/pull/114148
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [HLSL] Add `Increment`/`DecrementCounter` methods to structured buffers (PR #114148)

2024-11-07 Thread Tex Riddell via cfe-commits


@@ -271,53 +246,70 @@ struct BuiltinTypeDeclBuilder {
 return *this;
   }
 
+  FieldDecl *getResourceHandleField() {
+FieldDecl *FD = Fields["h"];

tex3d wrote:

Won't this add "h" when accessed this way?
How about:
```suggestion
FieldDecl *FD = Fields.lookup("h");
```

https://github.com/llvm/llvm-project/pull/114148
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Adding splitdouble HLSL function (PR #109331)

2024-10-23 Thread Tex Riddell via cfe-commits

https://github.com/tex3d approved this pull request.


https://github.com/llvm/llvm-project/pull/109331
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Adding splitdouble HLSL function (PR #109331)

2024-10-23 Thread Tex Riddell via cfe-commits


@@ -2074,6 +2083,19 @@ bool SemaHLSL::CheckBuiltinFunctionCall(unsigned 
BuiltinID, CallExpr *TheCall) {
   return true;
 break;
   }
+  case Builtin::BI__builtin_hlsl_elementwise_splitdouble: {
+if (SemaRef.checkArgCount(TheCall, 3))
+  return true;
+
+if (CheckScalarOrVector(&SemaRef, TheCall, SemaRef.Context.DoubleTy, 0) ||
+CheckScalarOrVector(&SemaRef, TheCall, SemaRef.Context.UnsignedIntTy,
+1) ||
+CheckScalarOrVector(&SemaRef, TheCall, SemaRef.Context.UnsignedIntTy,
+2))
+  return true;

tex3d wrote:

I just realized that we don't have tests for the 
`__builtin_hlsl_elementwise_splitdouble` operation for these diagnostics.  I 
think we need checks similar to those found in 
`clang/test/Sema/builtins-elementwise-math.c` but in a new test under 
`clang/test/SemaHLSL/BuiltIns/`.

https://github.com/llvm/llvm-project/pull/109331
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Adding splitdouble HLSL function (PR #109331)

2024-10-24 Thread Tex Riddell via cfe-commits

https://github.com/tex3d requested changes to this pull request.


https://github.com/llvm/llvm-project/pull/109331
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


  1   2   >