================
@@ -5692,7 +5692,10 @@ CGCallee CodeGenFunction::EmitCallee(const Expr *E) {
   // Resolve direct calls.
   } else if (auto DRE = dyn_cast<DeclRefExpr>(E)) {
     if (auto FD = dyn_cast<FunctionDecl>(DRE->getDecl())) {
-      return EmitDirectCallee(*this, FD);
+      auto CalleeDecl = FD->hasAttr<OpenCLKernelAttr>()
+                            ? GlobalDecl(FD, KernelReferenceKind::Stub)
+                            : FD;
+      return EmitDirectCallee(*this, CalleeDecl);
----------------
rjmccall wrote:
Hmm. It looks like the CUDA folks had this same problem and came up with an
awkward workaround for it in `EmitDirectCallee`. We should really just be
requesting the right GD in the first place. Could you add a
`getGlobalDeclForDirectCall` function that does the right thing for both modes?
If it ends up causing complicated behavior/test changes in CUDA mode, feel free
to exclude CUDA for now and just leave a comment saying that the workaround in
`EmitDirectCallee` should be removed in favor of doing the right thing in that
function.
https://github.com/llvm/llvm-project/pull/115821
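
For illustration only, the shape of the helper requested above might look roughly like the sketch below. It is not the code from the PR. `FunctionDecl`, `GlobalDecl`, and `KernelReferenceKind` here are simplified stand-ins for the Clang classes of the same names, the `IsOpenCLKernel` and `IsCUDAGlobal` flags are hypothetical substitutes for `hasAttr<OpenCLKernelAttr>()` and the corresponding CUDA attribute check, and the default kind chosen for ordinary functions is an assumption made so that the example compiles on its own. The sketch takes the "exclude CUDA for now" option and records that in a comment.

```cpp
#include <cassert>

// Simplified stand-ins for the Clang types named in the diff.
// These are NOT the real classes; they only model what the sketch needs.
enum class KernelReferenceKind { Kernel, Stub };

struct FunctionDecl {
  bool IsOpenCLKernel = false; // hypothetical: models hasAttr<OpenCLKernelAttr>()
  bool IsCUDAGlobal = false;   // hypothetical: models the CUDA kernel attribute
};

struct GlobalDecl {
  const FunctionDecl *FD;
  KernelReferenceKind Kind;
  // Assumption: ordinary functions get the default kind in this mock.
  GlobalDecl(const FunctionDecl *FD,
             KernelReferenceKind Kind = KernelReferenceKind::Kernel)
      : FD(FD), Kind(Kind) {}
};

// Hypothetical helper following the reviewer's suggestion: pick the
// GlobalDecl a direct call should use, so the call site does not have to
// special-case kernels itself.
GlobalDecl getGlobalDeclForDirectCall(const FunctionDecl *FD) {
  // A direct call to an OpenCL kernel should go through the stub.
  if (FD->IsOpenCLKernel)
    return GlobalDecl(FD, KernelReferenceKind::Stub);

  // FIXME (as the review suggests): CUDA is excluded for now. The
  // workaround in EmitDirectCallee should eventually be removed in favor
  // of selecting the right GlobalDecl here.
  return GlobalDecl(FD);
}
```

With a helper along these lines, the call site in `EmitCallee` would reduce to passing `getGlobalDeclForDirectCall(FD)` to `EmitDirectCallee`, keeping the kernel-versus-stub decision in one place for both modes.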
_______________________________________________
cfe-commits mailing list
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits