tra added a comment.

"Interoperability with other compilers" is probably a statement that's a bit 
too strong. At best it's kind of compatible with CUDA tools and I don't think 
it's feasible for other compilers. I.e. it will be useless for AMD GPUs and 
whatever compiler they use.

In general it sounds like you're going back to what the regular CUDA 
compilation pipeline does (sketched as commands below the list):

- [clang] C++->.ptx
- [ptxas] .ptx -> .cubin
- [fatbin] .cubin -> .fatbin
- [clang] C++ + .fatbin -> host .o
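
For reference, the first three steps map roughly onto the following standalone 
commands (just a sketch; file names and sm_70 are placeholders, and the exact 
flags vary between clang/CUDA versions):

  # C++ -> .ptx (device-side compilation only)
  clang++ -x cuda --cuda-device-only --cuda-gpu-arch=sm_70 -S foo.cu -o foo.ptx
  # .ptx -> .cubin
  ptxas -arch=sm_70 foo.ptx -o foo.cubin
  # .cubin -> .fatbin
  fatbinary --create foo.fatbin --image=profile=sm_70,file=foo.cubin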

On one hand I can see how being able to treat GPU-side binaries like any other 
host file is convenient. On the other hand, this convenience comes at the price 
of targeting only NVPTX. This seems contrary to OpenMP's goal of supporting 
many different kinds of accelerators. I'm not sure what the consensus in the 
OpenMP community is these days, but I vaguely recall that generic 
bundling/unbundling was explicitly chosen over vendor-specific encapsulation in 
the host .o when the bundling was implemented. If the underlying reasons have 
changed since then, it would be great to hear more details about that.

Assuming we do proceed with the back-to-CUDA approach, one thing I'd consider 
would be using clang's -fcuda-include-gpubinary option, which CUDA compilation 
uses to include GPU code in the host object. You may be able to use it to avoid 
compiling and partially linking the .fatbin and the host .o.
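
A host-side invocation with that flag might look roughly like this (again just 
a sketch; -fcuda-include-gpubinary is a cc1-level option, so it would have to 
be passed through -Xclang, and the file names are placeholders):

  clang++ -x cuda --cuda-host-only -c foo.cu -o foo.o \
      -Xclang -fcuda-include-gpubinary -Xclang foo.fatbin

That should produce a host object with the fatbin already embedded, so there 
would be no separate step to turn the .fatbin into an object and partially link 
it with the host .o.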


Repository:
  rC Clang

https://reviews.llvm.org/D47394


