[PATCH] D120566: [OpenCL][AMDGPU]: Do not allow a call to kernel

2022-02-25 Thread John McCall via Phabricator via cfe-commits
rjmccall added a comment.
In D120566#3346533, @arsenm wrote:
> In D120566#3346506, @rjmccall wrote:
>> Is there something which stops you from taking the address of a kernel and then calling it? If not

[PATCH] D120566: [OpenCL][AMDGPU]: Do not allow a call to kernel

2022-02-25 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment.
In D120566#3346506, @rjmccall wrote:
> Is there something which stops you from taking the address of a kernel and then calling it? If not, are there actually any uses of kernels in the module that shouldn't be rewritten as

[PATCH] D120566: [OpenCL][AMDGPU]: Do not allow a call to kernel

2022-02-25 Thread John McCall via Phabricator via cfe-commits
rjmccall added a comment. Is there something which stops you from taking the address of a kernel and then calling it? If not, are there actually any uses of kernels in the module that shouldn't be rewritten as uses of the clone? I feel like this would be a lot easier to just fix in your LLVM p

[PATCH] D120566: [OpenCL][AMDGPU]: Do not allow a call to kernel

2022-02-25 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments.
Comment at: clang/lib/CodeGen/TargetInfo.cpp:9238
+static llvm::Function *getKernelClone(llvm::Function &F) {
+  llvm::Module *M = F.getParent();
I don't think we can really start with the function IR. The TargetABIInfo could be d

[PATCH] D120566: [OpenCL][AMDGPU]: Do not allow a call to kernel

2022-02-25 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment.
In D120566#3345604, @yaxunl wrote:
> One of my concerns is that all kernels are duplicated which may cause code object size doubled.
Not really, the kernel should just be a stub that calls the real implementation function. In

[PATCH] D120566: [OpenCL][AMDGPU]: Do not allow a call to kernel

2022-02-25 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. One of my concerns is that all kernels are duplicated which may cause code object size doubled. Do we need to make the clone always_inline and let the kernel call its clone to avoid duplicate function bodies? Or LLVM has some pass to do that? Another concern is that the

[PATCH] D120566: [OpenCL][AMDGPU]: Do not allow a call to kernel

2022-02-25 Thread Christudasan Devadasan via Phabricator via cfe-commits
cdevadas created this revision. cdevadas added reviewers: rjmccall, Anastasia, yaxunl, arsenm. Herald added subscribers: Naghasan, ldrumm, kerbowa, t-tye, tpr, dstuttard, jvesely, kzhuravl. cdevadas requested review of this revision. Herald added subscribers: cfe-commits, wdng. Herald added a proj