tianshilei1992 added a comment.

In general we're moving in the direction where the target-specific implementation 
is compiled along with user code, which is fantastic. That way, we only 
need to provide one bitcode library per target. However, the change in the FE is 
somewhat inefficient. If user code has multiple files, the target-specific header 
will be included multiple times, and thus compiled multiple times. A more efficient 
way is to change the workflow of the driver, probably as follows:

1. Compile target implementation `t.bc`
2. Link `t.bc` and `libomptarget-[arch].bc` to `libomptarget.bc`
3. Compile user code, which itself takes multiple steps. `libomptarget.bc` is fed 
into the FE in this step.
4. Remaining steps...
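
To make the efficiency argument concrete, here is a minimal sketch of a cost model that counts how often the target-specific implementation gets compiled under each approach. The function names are hypothetical and nothing here is clang driver code; it only assumes that the header approach recompiles the target code once per user file, while the driver workflow compiles `t.bc` once and reuses the linked `libomptarget.bc`.

```cpp
#include <cstddef>

// Hypothetical cost model, not part of clang: counts how many times the
// target-specific implementation is compiled under each approach.

// Header approach (the FE change under review): every user translation unit
// includes the target-specific header, so it is compiled once per file.
constexpr std::size_t targetCompilesViaHeader(std::size_t NumUserFiles) {
  return NumUserFiles;
}

// Driver approach (the numbered steps above): t.bc is compiled once, linked
// into libomptarget.bc once, and the result is reused for every user file.
constexpr std::size_t targetCompilesViaDriver(std::size_t NumUserFiles) {
  return NumUserFiles == 0 ? 0 : 1;
}

// With 8 user files, the header approach compiles the target code 8 times,
// while the driver approach compiles it once.
static_assert(targetCompilesViaHeader(8) == 8, "one compile per user file");
static_assert(targetCompilesViaDriver(8) == 1, "single shared compile");
```

The model ignores the cost of the extra link in step 2, which is paid once regardless of how many user files there are.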



================
Comment at: clang/lib/Driver/ToolChains/Clang.cpp:1204
+    {
+      auto *CTC = static_cast<const toolchains::CudaToolChain *>(
+          C.getSingleOffloadToolChain<Action::OFK_Cuda>());
----------------
JonChesterfield wrote:
> Logic very like this could pick out a second, small devicertl bitcode library
Can we just use one header with different macros, like what we're using now?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D95313/new/

https://reviews.llvm.org/D95313

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
