I will try to fix that. The CUDA kernel calling convention should be dropped in all DREs (DeclRefExprs), since it is invisible to the user.

Sam

-----Original Message-----
From: Artem Belevich via Phabricator [mailto:[email protected]]
Sent: Tuesday, April 03, 2018 1:51 PM
To: Liu, Yaxun (Sam) <[email protected]>; [email protected]; Arsenault, Matthew <[email protected]>
Cc: [email protected]; [email protected]; Zhuravlyov, Konstantin <[email protected]>; [email protected]; Stuttard, David <[email protected]>; [email protected]; Tye, Tony <[email protected]>; [email protected]; [email protected]
Subject: [PATCH] D44747: Set calling convention for CUDA kernel

tra added inline comments.

================
Comment at: lib/Sema/SemaType.cpp:3319-3330
+  // Attribute AT_CUDAGlobal affects the calling convention for AMDGPU targets.
+  // This is the simplest place to infer calling convention for CUDA kernels.
+  if (S.getLangOpts().CUDA && S.getLangOpts().CUDAIsDevice) {
+    for (const AttributeList *Attr = D.getDeclSpec().getAttributes().getList();
+         Attr; Attr = Attr->getNext()) {
+      if (Attr->getKind() == AttributeList::AT_CUDAGlobal) {
+        CC = CC_CUDAKernel;
----------------
tra wrote:
> This apparently breaks compilation of some CUDA code in our internal tests.
> I'm working on minimizing a reproduction case. Should this code be enabled
> for AMD GPUs only?

Here's a small snippet of code that previously used to compile and work:

```
template <typename T>
__global__ void EmptyKernel(void) { }

struct Dummy {
  /// Type definition of the EmptyKernel kernel entry point
  typedef void (*EmptyKernelPtr)();
  EmptyKernelPtr Empty() { return EmptyKernel<void>; }
};
```

AFAICT, it's currently impossible to apply __global__ to pointers, so there's no way to make the code above work with this patch applied.


Repository:
  rL LLVM

https://reviews.llvm.org/D44747

_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
