================
Comment at: include/clang/Driver/CC1Options.td:611
@@ -610,1 +610,3 @@
+def cuda_include_gpucode : Separate<["-"], "cuda-include-gpucode">,
+  HelpText<"Incorporate CUDA device-side code.">;
----------------
eliben wrote:
> tra wrote:
> > eliben wrote:
> > > I'm wondering about the "gpucode" mnemonic :-) It's unusual and kinda
> > > ambiguous. What does gpucode mean here? PTX? Maybe PTX can be more
> > > explicit then?
> > >
> > > PTX is probably not too specific since this flag begins with "cuda_" so
> > > it's already about the CUDA/PTX flow.
> > >
> > > [this applies to other uses of "gpucode" too]
> > It's actually an opaque blob. clang does not care what's in the file as it
> > just passes the bits to cudart which passes it to the driver. The driver
> > can digest PTX (which we pass in this case), but it will as happily accept
> > GPU code packed in fatbin or cubin formats. If/when we grow ability to
> > compile device-side to SASS, we would just do "-cuda-include-gpucode
> > gpu-code-packed-in.cubin" and it should work with no other changes on the
> > host side.
> >
> > So, 'gpucode' was the best approximation I could come up with that would
> > keep "GPU code in any shape or form as long as it's PTX/fatbin or cubin".
> >
> > I'd be happy to change it. Suggestions?
> I see - some generic mnemonic is needed, I agree (so PTX is not a good idea).
> But "--gpu-code" is a nvcc flag that means something completely different :-/
> So "gpu code" here may still be confusing. Maybe "gpublob" or "gpuobject" or
> "gpubinary" or something like that. I can't think of a perfect solution right
> now.
>
> I'll leave it to your discretion.
gpubinary wins.

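For illustration only, here is a minimal sketch of what "opaque blob" handling could look like on the host side. It assumes nothing about clang's actual implementation: readGpuBinary is a hypothetical helper, not a clang or cudart API. The point is that the bytes are loaded in binary mode and never inspected, so PTX, fatbin and cubin inputs would all take the same path.

```cpp
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of clang): load a GPU binary as raw bytes.
// The content is never parsed, so any format the driver accepts
// (PTX, fatbin, cubin) is handled identically.
std::string readGpuBinary(const std::string &path) {
  // Binary mode avoids any newline translation of the payload.
  std::ifstream in(path, std::ios::binary);
  if (!in)
    throw std::runtime_error("cannot open GPU binary: " + path);
  // Copy every byte, including embedded NULs, into the result.
  return std::string(std::istreambuf_iterator<char>(in),
                     std::istreambuf_iterator<char>());
}
```

The returned string can then be embedded as-is; its size() is the blob length, so embedded NUL bytes are preserved.
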
================
Comment at: lib/CodeGen/CGCUDARuntime.h:42
@@ -34,1 +41,3 @@
+  llvm::SmallVector<llvm::Function *, 16> EmittedKernels;
+  llvm::SmallVector<llvm::GlobalVariable *, 16> FatbinHandles;
----------------
eliben wrote:
> tra wrote:
> > eliben wrote:
> > > It would really be great not to have data inside this abstract interface;
> > > is this necessary?
> > >
> > > Note that "fatbin handles" sounds very NVIDIA CUDA runtime specific,
> > > though this interface is allegedly generic :)
> > List of generated kernels is something that I expect to be useful for all
> > subclasses of CUDARuntime.
> > That's why I've put EmittedKernels there and a non-virtual
> > method EmitDeviceStub() to populate it.
> >
> > FatbinHandles, on the other hand, is indeed cudart-specific. I've moved it
> > into CGCUDANV.
> I would still remove EmittedKernels for now; we only have a single CUDA
> runtime at this time in upstream, so this feels redundant, as it makes the
> runtime interface / implementation barrier less clean than it should be. In
> the future if/when new runtime implementations are added, we'll figure out
> what's the best way to factor common code out is.
>
> YAGNI, essentially :)
OK.

http://reviews.llvm.org/D8463

EMAIL PREFERENCES
  http://reviews.llvm.org/settings/panel/emailpreferences/

_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
