================
@@ -1562,6 +1562,23 @@ def HIPManaged : InheritableAttr {
let Documentation = [HIPManagedAttrDocs];
}
+def CUDAClusterDims : InheritableAttr {
+ let Spellings = [GNU<"cluster_dims">, Declspec<"__cluster_dims__">];
+ let Args = [ExprArgument<"X">, ExprArgument<"Y", 1>, ExprArgument<"Z", 1>];
+ let Subjects = SubjectList<[Function], ErrorDiag, "kernel functions">;
+ let LangOpts = [CUDA];
+ let Documentation = [CUDAClusterDimsAttrDoc];
+}
+
+def CUDANoCluster : InheritableAttr {
+ let Spellings = [GNU<"no_cluster">, Declspec<"__no_cluster__">];
----------------
shiltian wrote:
If a kernel doesn't have `__cluster_dims__`, the user can still enable the
cluster feature at runtime, at kernel launch. That means the compiler has to
be conservative about cluster-related handling in the backend and assume the
feature could be used.

On the other hand, `__no_cluster__` tells the compiler that the cluster feature
will not be enabled at runtime, which lets the backend optimize accordingly.
For AMDGPU, it helps the compiler avoid querying certain registers when
lowering some workgroup-related intrinsics.
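
To make the three cases concrete, here is a minimal sketch of the reasoning as a plain C++ model. It is not code from the patch: `ClusterInfo` and `mayRunInCluster` are hypothetical names used only for illustration, and the real backend logic may be structured quite differently.

```cpp
// Hypothetical model, not Clang/LLVM code: what a kernel's attributes tell
// the backend about the cluster feature.
enum class ClusterInfo {
  Unspecified, // neither attribute: clusters may still be enabled at launch
  FixedDims,   // __cluster_dims__(x, y, z): cluster dims given at compile time
  NoCluster    // __no_cluster__: clusters will not be enabled at launch
};

// True when the backend has to keep cluster-aware lowering for the kernel.
// Only an explicit __no_cluster__ allows it to be dropped.
constexpr bool mayRunInCluster(ClusterInfo info) {
  return info != ClusterInfo::NoCluster;
}

// An unannotated kernel is treated conservatively.
static_assert(mayRunInCluster(ClusterInfo::Unspecified),
              "no attribute: assume clusters could be used");
static_assert(!mayRunInCluster(ClusterInfo::NoCluster),
              "__no_cluster__: cluster handling can be skipped");
```

The point of the model is that "no attribute" and `__no_cluster__` are different states: only the latter gives the backend a guarantee it can act on.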
https://github.com/llvm/llvm-project/pull/156686
_______________________________________________
cfe-commits mailing list
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits