jhuber6 added a comment.

Why do we have the JIT in the nextgen plugins? I figured that JIT would be handled by `libomptarget` proper rather than by the plugins. I guess this is needed for per-kernel specialization? My rough idea of the pseudocode would be like this, and we wouldn't need a complex class hierarchy. Also, I don't know if we can skip `ptxas` by giving CUDA the PTX directly; we will probably still need to invoke `lld` on the command line though, right?

  for each image:
    if image is bitcode
      image = compile(image)
    register(image)


================
Comment at: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp:879
                            "Cannot embed bitcode with multiple files.");
-    OutputFiles.push_back(static_cast<std::string>(BitcodeOutput.front()));
+    OutputFiles.push_back(Args.MakeArgString(BitcodeOutput.front()));
     return Error::success();
----------------
tianshilei1992 wrote:
> This will be pushed by Joseph in another patch.
Did that this morning.

================
Comment at: openmp/libomptarget/plugins-nextgen/common/PluginInterface/CMakeLists.txt:24
 # Plugin Interface library.
-add_library(PluginInterface OBJECT PluginInterface.cpp GlobalHandler.cpp)
+add_llvm_library(PluginInterface OBJECT PluginInterface.cpp GlobalHandler.cpp JIT.cpp)
----------------
tianshilei1992 wrote:
> I guess this might cause the issue of non-protected global symbols.
Should we be able to put all this in the `add_llvm_library`?

================
Comment at: openmp/libomptarget/plugins-nextgen/common/PluginInterface/JIT.cpp:47-51
+  InitializeAllTargetInfos();
+  InitializeAllTargets();
+  InitializeAllTargetMCs();
+  InitializeAllAsmParsers();
+  InitializeAllAsmPrinters();
----------------
We could probably limit these to the ones we actually care about since we know the triples. Not sure if it would save us much runtime.

================
Comment at: openmp/libomptarget/plugins-nextgen/common/PluginInterface/JIT.cpp:184
+
+  auto AddStream =
+      [&](size_t Task,
----------------
tianshilei1992 wrote:
> Is there any way that we don't write it to a file here?
Why do we need to invoke LTO here? I figured that we could call the backend directly, since we have no need to actually link any files and we may not need to run more expensive optimizations when the bitcode is already optimized. If you do that, then you should be able to just use a `raw_svector_ostream` as your output stream and get the compiled output written to that buffer.

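
To make the rough pseudocode from my top-level comment a bit more concrete, here is a minimal, self-contained sketch of the flow I have in mind. Everything in it is hypothetical and only for illustration: `Image`, `compile`, `Registry`, and `registerAll` are stand-in names, not existing `libomptarget` or plugin interfaces, and the "compile" step just tags the data instead of running a real backend.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a device image; not a real libomptarget type.
struct Image {
  bool IsBitcode;   // true if the image still holds LLVM bitcode
  std::string Data; // placeholder for the image contents
};

// Hypothetical placeholder for the JIT step. A real version would run the
// backend on the bitcode; here we only mark the data as compiled.
Image compile(const Image &Bitcode) {
  return Image{/*IsBitcode=*/false, "obj(" + Bitcode.Data + ")"};
}

// Hypothetical registry standing in for whatever registers images with the
// device.
struct Registry {
  std::vector<Image> Registered;
  void registerImage(Image I) { Registered.push_back(std::move(I)); }
};

// The loop from the pseudocode: compile bitcode images first, then register
// every image the same way, with no separate class hierarchy for JIT images.
void registerAll(const std::vector<Image> &Images, Registry &R) {
  for (const Image &I : Images) {
    if (I.IsBitcode)
      R.registerImage(compile(I));
    else
      R.registerImage(I);
  }
}
```

The point of the sketch is only that the bitcode check sits in one place in front of a single registration path, so the rest of the code never needs to know whether an image was compiled ahead of time or at runtime.
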
Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D139287/new/

https://reviews.llvm.org/D139287