jhuber6 added a comment.

In D139287#3971024 <https://reviews.llvm.org/D139287#3971024>, @tianshilei1992 
wrote:

> In D139287#3970996 <https://reviews.llvm.org/D139287#3970996>, @jhuber6 wrote:
>
>> Why do we have the JIT in the nextgen plugins? I figured that JIT would be 
>> handled by `libomptarget` proper rather than the plugins. I guess this is 
>> needed for per-kernel specialization? My idea of the rough pseudocode would 
>> be like this, and we wouldn't need a complex class hierarchy. Also, I don't 
>> know if we can skip `ptxas` by giving CUDA the PTX directly. We will 
>> probably need to invoke `lld` on the command line, however, right?
>>
>>   for each image:
>>     if image is bitcode
>>       image = compile(image)
>>     register(image)
>
> We could handle them in `libomptarget`, but that would require adding two 
> more interface functions: `is_valid_bitcode_image` and 
> `compile_bitcode_image`. It is doable. Handling them in the plugin as a 
> separate module lets us just reuse the two existing interfaces.

Would we need to consult the plugin? We can just check the `magic` directly; if 
it's bitcode, we just compile it for its triple. If that turns out to be wrong, 
the plugin will error when it gets the compiled image.
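
As a rough sketch of what I mean (not the actual `libomptarget` code): 
`DeviceImage`, `isBitcodeImage`, `compileStub`, and `processImages` are all 
hypothetical names, and the compile step is only a stub standing in for the 
real backend invocation. The only real detail is the magic itself: raw LLVM 
bitcode starts with the bytes `'B' 'C' 0xC0 0xDE`, and the bitcode wrapper 
header starts with `0x0B17C0DE` stored little-endian.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical image type; the real plugin interface has its own structures.
struct DeviceImage {
  std::vector<uint8_t> Bytes;
  bool Compiled = false;
};

// Returns true if the image begins with the raw bitcode magic
// ('B', 'C', 0xC0, 0xDE) or the bitcode wrapper magic (0x0B17C0DE,
// stored little-endian as DE C0 17 0B).
bool isBitcodeImage(const DeviceImage &Image) {
  const std::vector<uint8_t> &B = Image.Bytes;
  if (B.size() < 4)
    return false;
  bool Raw = B[0] == 0x42 && B[1] == 0x43 && B[2] == 0xC0 && B[3] == 0xDE;
  bool Wrapper = B[0] == 0xDE && B[1] == 0xC0 && B[2] == 0x17 && B[3] == 0x0B;
  return Raw || Wrapper;
}

// Stub standing in for the JIT compile step (assumption: the real version
// would run the backend for the image's triple). It just returns a fake
// ELF-looking blob so the control flow can be exercised.
DeviceImage compileStub(const DeviceImage &) {
  DeviceImage Out;
  Out.Bytes = {0x7F, 0x45, 0x4C, 0x46};
  Out.Compiled = true;
  return Out;
}

// The loop from the pseudocode: compile bitcode images in place and leave
// everything else untouched. Registration is omitted; the return value is
// the number of images that were compiled.
size_t processImages(std::vector<DeviceImage> &Images) {
  size_t NumCompiled = 0;
  for (DeviceImage &Image : Images) {
    if (isBitcodeImage(Image)) {
      Image = compileStub(Image);
      ++NumCompiled;
    }
  }
  return NumCompiled;
}
```

Nothing here asks the plugin whether the image is valid; a mismatched image 
would simply be rejected later when the plugin tries to load it.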

>> Also, I don't know if we can skip `ptxas` by giving CUDA the PTX directly. 
>> We will probably need to invoke `lld` on the command line, however, right?
>>
>>   for each image:
>>     if image is bitcode
>>       image = compile(image)
>>     register(image)
>
> We can give CUDA PTX directly, since the CUDA JIT just calls `ptxas` rather 
> than `ptxas -c`, which would require `nvlink` afterwards.

That makes it easier for us, so the only command line tool we need to call is 
`lld` for AMDGPU.



================
Comment at: 
openmp/libomptarget/plugins-nextgen/common/PluginInterface/JIT.cpp:184
+
+  auto AddStream =
+      [&](size_t Task,
----------------
tianshilei1992 wrote:
> jhuber6 wrote:
> > tianshilei1992 wrote:
> > > Is there any way that we don't write it to a file here?
> > Why do we need to invoke LTO here? I figured that we could call the backend 
> > directly since we have no need to actually link any files, and we may not 
> > have a need to run more expensive optimizations when the bitcode is already 
> > optimized. If you do that then you should be able to just use a 
> > `raw_svector_ostream` as your output stream and get the compiled output 
> > written to that buffer.
> For the purpose of this basic JIT support, we indeed just need the backend. 
> However, since we have plans for super optimization, etc., having an 
> optimization pipeline here is also useful.
We should be able to configure our own optimization pipeline in that case; we 
might want the extra control as well.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D139287/new/

https://reviews.llvm.org/D139287
