================
@@ -176,3 +176,34 @@ Predefined Macros
    * - ``HIP_API_PER_THREAD_DEFAULT_STREAM``
      - Alias to ``__HIP_API_PER_THREAD_DEFAULT_STREAM__``. Deprecated.
 
+C++20 Concepts with HIP and CUDA
+--------------------------------
+
+In Clang, when working with HIP or CUDA, it's important to note that all 
constraints in C++20 concepts are assumed to be for the host side only. This 
behavior is consistent across both programming models, and developers should be 
aware of this assumption when writing code that utilizes C++20 concepts.
+
+Example:
+.. code-block:: c++
+
+   template <class T>
+   concept MyConcept = requires(T& obj) {
+     my_function(obj);  // Assumed to be a host-side requirement
----------------
yxsamliu wrote:

For a typical use of concepts in CUDA programs, please see 
https://godbolt.org/z/o7Wa68n9c

This is taken from issue https://github.com/llvm/llvm-project/issues/67507.

In this example, users want to express two constraints on geometric_shape:

1. it can be passed to a function draw

2. it can be passed to a function area and the result is convertible to double

For the first constraint, users only need it on the host side. For the second 
constraint, users need it on both the host side and the device side. This gives 
us some insight into users' needs for constraints: they are usually different 
for host and device sides, since users may want to do different things on host 
and device sides. Therefore, assuming a constraint in a concept should be 
satisfied on both the device and host sides will result in some unnecessary 
extra constraints on either side.

Is it OK to evaluate the constraints in the context where the template is 
instantiated? For example, when we instantiate the kernel 
`template <geometric_shape T> __global__ void compute_areas`, can we evaluate 
the constraints in the device context to get what we need? No, because then 
the constraint on the function `draw` would have to be satisfied on the device 
side, which is not what we need. The point is that each constraint defined in 
a concept needs its own required context: we want to be able to express that 
one constraint should be satisfied in the device context and another in the 
host context. That is why I propose allowing `__device__` and `__host__` 
attributes on the call expressions in concepts to indicate the required 
context of an individual constraint.

Now that we have discussed users' needs regarding the device/host contexts of 
constraints, let's look at how nvcc currently evaluates the satisfaction of 
constraints.

Based on https://godbolt.org/z/o7Wa68n9c , the instantiation of 
`work<triangle>` is successful, so we can infer that `triangle` satisfies both 
constraints. They can only have been evaluated in the host context, since 
`draw` and `area` are both host functions. Even though the instantiation of 
`work<triangle>` is done in a device context, the evaluation of the constraints 
is still done in the host context.

The current patch matches nvcc's behaviour.

https://github.com/llvm/llvm-project/pull/67721