jhuber6 added a comment.
Why do we have the JIT in the nextgen plugins? I figured that JIT would be
handled by `libomptarget` proper rather than the plugins. I guess this is
needed for per-kernel specialization? My idea of the rough pseudocode would be
like this, and we wouldn't need a complex class hierarchy. Also, I don't know if
we can skip `ptxas` by giving CUDA the PTX directly; we will probably still need
to invoke `lld` on the command line, however, right?
  for each image:
    if image is bitcode
      image = compile(image)
    register(image)
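
To make the pseudocode above a bit more concrete, here is a minimal standalone
sketch of that loop. It is not the libomptarget API: `DeviceImage`,
`isBitcode`, `registerImages`, and the two callbacks are hypothetical names used
only for illustration. The one thing taken as fact is that raw LLVM bitcode
starts with the magic bytes 'B' 'C' 0xC0 0xDE; the actual compile and register
steps are left to the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical stand-in for a device image; not a libomptarget type.
struct DeviceImage {
  std::vector<std::uint8_t> Bytes;
};

// Raw LLVM bitcode files begin with the magic bytes 'B' 'C' 0xC0 0xDE.
bool isBitcode(const DeviceImage &Image) {
  static constexpr std::uint8_t Magic[] = {0x42, 0x43, 0xC0, 0xDE};
  if (Image.Bytes.size() < sizeof(Magic))
    return false;
  for (std::size_t I = 0; I < sizeof(Magic); ++I)
    if (Image.Bytes[I] != Magic[I])
      return false;
  return true;
}

// Hypothetical hooks: the backend compile step and the plugin registration.
using CompileFn = std::function<DeviceImage(const DeviceImage &)>;
using RegisterFn = std::function<void(const DeviceImage &)>;

// Compiles every bitcode image in place, then registers each image.
// Returns the number of images that had to be compiled.
std::size_t registerImages(std::vector<DeviceImage> &Images,
                           const CompileFn &Compile,
                           const RegisterFn &Register) {
  assert(Compile && Register && "both callbacks must be set");
  std::size_t NumCompiled = 0;
  for (DeviceImage &Image : Images) {
    if (isBitcode(Image)) {
      Image = Compile(Image); // image = compile(image)
      ++NumCompiled;
    }
    Register(Image); // register(image)
  }
  return NumCompiled;
}
```

The point of the sketch is only that the bitcode check and the compile step fit
in one flat loop in front of registration, with no class hierarchy involved.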
================
Comment at: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp:879
"Cannot embed bitcode with multiple files.");
- OutputFiles.push_back(static_cast<std::string>(BitcodeOutput.front()));
+ OutputFiles.push_back(Args.MakeArgString(BitcodeOutput.front()));
return Error::success();
----------------
tianshilei1992 wrote:
> This will be pushed by Joseph in another patch.
Did that this morning.
================
Comment at: openmp/libomptarget/plugins-nextgen/common/PluginInterface/CMakeLists.txt:24
# Plugin Interface library.
-add_library(PluginInterface OBJECT PluginInterface.cpp GlobalHandler.cpp)
+add_llvm_library(PluginInterface OBJECT PluginInterface.cpp GlobalHandler.cpp
JIT.cpp)
----------------
tianshilei1992 wrote:
> I guess this might cause the issue of non-protected global symbols.
Should we be able to put all this in the `add_llvm_library`?
================
Comment at: openmp/libomptarget/plugins-nextgen/common/PluginInterface/JIT.cpp:47-51
+ InitializeAllTargetInfos();
+ InitializeAllTargets();
+ InitializeAllTargetMCs();
+ InitializeAllAsmParsers();
+ InitializeAllAsmPrinters();
----------------
We could probably limit these to the ones we actually care about since we know
the triples. Not sure if it would save us much runtime.
================
Comment at: openmp/libomptarget/plugins-nextgen/common/PluginInterface/JIT.cpp:184
+
+ auto AddStream =
+ [&](size_t Task,
----------------
tianshilei1992 wrote:
> Is there any way that we don't write it to a file here?
Why do we need to invoke LTO here? I figured that we could call the backend
directly, since we have no need to actually link any files and we may not need
to run more expensive optimizations when the bitcode is already optimized. If
you do that, then you should be able to just use a `raw_svector_ostream` as
your output stream and get the compiled output written to that buffer.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D139287/new/
https://reviews.llvm.org/D139287