[clang] [Clang][OpenCL][AMDGPU] Allow a kernel to call another kernel (PR #115821)

John McCall via cfe-commits Mon, 02 Dec 2024 00:04:44 -0800

================
@@ -5692,7 +5692,10 @@ CGCallee CodeGenFunction::EmitCallee(const Expr *E) {
   // Resolve direct calls.
   } else if (auto DRE = dyn_cast<DeclRefExpr>(E)) {
     if (auto FD = dyn_cast<FunctionDecl>(DRE->getDecl())) {
-      return EmitDirectCallee(*this, FD);
+      auto CalleeDecl = FD->hasAttr<OpenCLKernelAttr>()
+                            ? GlobalDecl(FD, KernelReferenceKind::Stub)
+                            : FD;
+      return EmitDirectCallee(*this, CalleeDecl);
----------------
rjmccall wrote:


Hmm. It looks like the CUDA folks had this same problem and came up with an 
awkward workaround for it in `EmitDirectCallee`. We should really just be 
requesting the right `GlobalDecl` in the first place. Could you add a 
`getGlobalDeclForDirectCall` function that does the right thing for both modes? 
If it ends up causing complicated behavior/test changes in CUDA mode, feel free 
to exclude CUDA for now and just leave a comment saying that the workaround 
should be removed in favor of doing the right thing in that function.
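
A rough sketch of the shape such a helper might take, written against simplified 
stand-in types so that it compiles on its own. The `FunctionDecl` and 
`GlobalDecl` structs below, the `IsOpenCLKernel` and `HasKernelKind` fields, and 
the helper's signature are all assumptions made for illustration; they are not 
the real Clang classes, and the actual function in the PR may look different. 
CUDA is left out, as the comment above allows.

```cpp
#include <cassert>

// Simplified stand-ins for the Clang types in the diff; NOT the real classes.
enum class KernelReferenceKind { Kernel, Stub };

struct FunctionDecl {
  // Models FD->hasAttr<OpenCLKernelAttr>() from the diff (assumed field).
  bool IsOpenCLKernel = false;
};

struct GlobalDecl {
  const FunctionDecl *FD;
  bool HasKernelKind;       // false for ordinary functions (assumed field)
  KernelReferenceKind Kind; // meaningful only when HasKernelKind is true

  // Ordinary function: no kernel reference kind attached.
  explicit GlobalDecl(const FunctionDecl *F)
      : FD(F), HasKernelKind(false), Kind(KernelReferenceKind::Kernel) {}

  // Kernel function: carries an explicit reference kind.
  GlobalDecl(const FunctionDecl *F, KernelReferenceKind K)
      : FD(F), HasKernelKind(true), Kind(K) {}
};

// Hypothetical helper named in the review: choose the declaration to request
// for a direct call up front, instead of patching it up in the callee.
GlobalDecl getGlobalDeclForDirectCall(const FunctionDecl *FD) {
  // A direct call to an OpenCL kernel is routed to the stub, as in the diff.
  if (FD->IsOpenCLKernel)
    return GlobalDecl(FD, KernelReferenceKind::Stub);

  // CUDA kernels are deliberately not handled in this sketch; the existing
  // workaround in EmitDirectCallee would stay in place until they are.
  return GlobalDecl(FD);
}
```

With something like this in place, the call site in `EmitCallee` would reduce to 
passing the helper's result to `EmitDirectCallee`, and the kernel check would 
live in one place.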

https://github.com/llvm/llvm-project/pull/115821
_______________________________________________
cfe-commits mailing list
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits