================
Comment at: include/clang/Driver/CC1Options.td:611
@@ -610,1 +610,3 @@
 
+def cuda_include_gpucode : Separate<["-"], "cuda-include-gpucode">,
+  HelpText<"Incorporate CUDA device-side code.">;
----------------
eliben wrote:
> tra wrote:
> > eliben wrote:
> > > I'm wondering about the "gpucode" mnemonic :-) It's unusual and kinda 
> > > ambiguous. What does gpucode mean here? PTX? Maybe PTX can be more 
> > > explicit then?
> > > 
> > > PTX is probably not too specific since this flag begins with "cuda_" so 
> > > it's already about the CUDA/PTX flow.
> > > 
> > > [this applies to other uses of "gpucode" too]
> > It's actually an opaque blob. clang does not care what's in the file as it 
> > just passes the bits to cudart which passes it to the driver. The driver 
> > can digest PTX (which we pass in this case), but it will as happily accept 
> > GPU code packed in fatbin or cubin formats. If/when we grow ability to 
> > compile device-side to SASS, we would just  do "-cuda-include-gpucode 
> > gpu-code-packed-in.cubin" and it should work with no other changes on the 
> > host side.
> > 
> > So, 'gpucode' was the best approximation I could come up with that would 
> > keep "GPU code in any shape or form as long as it's PTX/fatbin or cubin".
> > 
> > I'd be happy to change it. Suggestions?
> I see - some generic mnemonic is needed, I agree (so PTX is not a good idea). 
> But "--gpu-code" is a nvcc flag that means something completely different :-/ 
> So "gpu code" here may still be confusing. Maybe "gpublob" or "gpuobject" or 
> "gpubinary" or something like that. I can't think of a perfect solution right 
> now.
> 
>  I'll leave it to your discretion.
gpubinary wins.

================
Comment at: lib/CodeGen/CGCUDARuntime.h:42
@@ -34,1 +41,3 @@
 
+  llvm::SmallVector<llvm::Function *, 16> EmittedKernels;
+  llvm::SmallVector<llvm::GlobalVariable *, 16> FatbinHandles;
----------------
eliben wrote:
> tra wrote:
> > eliben wrote:
> > > It would really be great not to have data inside this abstract interface; 
> > > is this necessary?
> > > 
> > > Note that "fatbin handles" sounds very NVIDIA CUDA runtime specific, 
> > > though this interface is allegedly generic :)
> > List of generated kernels is something that I expect to be useful for all 
> > subclasses of CUDARuntime. 
> > That's why I've put EmittedKernels there and a non-virtual 
> > methodEmitDeviceStub() to populate it.
> > 
> > FatbinHandles, on the other hand, is indeed cudart-specific. I've moved it 
> > into CGCUDANV.
> I would still remove EmittedKernels for now; we only have a single CUDA 
> runtime at this time in upstream, so this feels redundant, as it makes the 
> runtime interface / implementation barrier less clean than it should be. In 
> the future if/when new runtime implementations are added, we'll figure out 
> what's the best way to factor common code out is. 
> 
> YAGNI, essentially :)
OK.

http://reviews.llvm.org/D8463

EMAIL PREFERENCES
  http://reviews.llvm.org/settings/panel/emailpreferences/



_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits

Reply via email to