yaxunl added a comment.

In D60620#1464633 <https://reviews.llvm.org/D60620#1464633>, @tra wrote:

> It looks like you are solving two problems here.
>  a) you want to create multiple device passes for the same GPU, but with 
> different options.
>  b) you may want to pass different compiler options to different device 
> compilations.
>  The patch effectively hard-codes {gpu, options} tuple into 
> --offloading-target-id variant.
>  Is that correct?
>
> This looks essentially the same as your previous patch D59863 
> <https://reviews.llvm.org/D59863>.
>
> We have a limited way to deal with (b), but there's currently no way to deal 
> with (a).
>
> For (a), I think, the real problem is that until now we've assumed that 
> there's only one device-side compilation per target GPU arch. If we need 
> multiple device-side compilations, we need a way to name them.  Using 
> `offloading-target-id` as  a super-set of `--cuda-gpu-arch` is OK with me. 
> However, I'm on the fence about the option serving a double-duty of setting 
> magic compiler flags. On one hand, that's what driver does, so it may be OK. 
> On the other hand, it's unnecessarily strict. I.e. if we provide ability to 
> create multiple compilation passes for the same GPU arch, why limit that to 
> only changing those hard-coded options? A general approach would allow a way 
> to create more than one device-side compilation and provide arbitrary 
> compiler options only to *that* compilation. This will also help solve a 
> number of issues we have right now when some host-side compilation options 
> break device-side compilation and we have to work around that by filtering 
> out some of them in the driver.


This patch is trying to solve the GPU arch explosion caused by combinations of 
GPU configurations. A GPU may have several configurations that require 
different ISAs. From the compiler's point of view, a GPU plus a configuration 
behaves like a distinct GPU arch. Previously we used different gfx names for 
the same GPU with different configurations, but that does not scale. Therefore 
this patch extends GPU arch to `target id`, which has the form 
gpu+feature1-feature2.
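To make the naming scheme concrete, here is a minimal sketch of how a string of the form gpu+feature1-feature2 could be split into a GPU name and a set of on/off features. It is an illustration only, not the parsing code in the patch: the function name `parse_target_id` is hypothetical, `gfx906` is just an example GPU name, and `feature1`/`feature2` are the placeholders from the description above. It assumes `+` means the feature is on and `-` means it is off.

```python
import re


def parse_target_id(target_id):
    """Split 'gpu+feature1-feature2' into (gpu, {feature: enabled}).

    Hypothetical helper: assumes '+' enables a feature and '-' disables it.
    """
    # GPU name first, then zero or more (+|-)feature groups.
    match = re.match(r"^([^+-]+)((?:[+-][^+-]+)*)$", target_id)
    if match is None:
        raise ValueError(f"malformed target id: {target_id!r}")
    gpu, rest = match.groups()

    features = {}
    for sign, name in re.findall(r"([+-])([^+-]+)", rest):
        if name in features:
            raise ValueError(f"feature {name!r} specified twice")
        features[name] = (sign == "+")
    return gpu, features


gpu, features = parse_target_id("gfx906+feature1-feature2")
print(gpu, features)
```

A plain GPU name without any features is treated as a target id with an empty feature set, so existing GPU arch names would still be accepted under this sketch.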

The features allowed in a target id are not arbitrary target features. They 
correspond to a limited number of GPU configurations that the HIP runtime 
understands. Basically, the HIP runtime looks at the target ids of the device 
objects in a fat binary and knows which one is best for the current GPU 
configuration. This is not a feature that users can easily implement 
themselves, since it requires knowledge of the GPU configurations and the 
corresponding compiler options. Therefore it is better implemented within the 
HIP compiler/runtime.
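The selection step can be sketched roughly as follows. This is not the HIP runtime's actual algorithm; the matching policy is an assumption made for illustration. A device object is treated as compatible if its GPU name matches and every feature it names agrees with the current device configuration, and among compatible objects the one naming the most features wins. The function name `select_device_object` and the object labels are hypothetical.

```python
def select_device_object(candidates, device_gpu, device_config):
    """Pick the best device object for the current GPU configuration.

    candidates:    {label: (gpu, {feature: enabled})}, one per device object
    device_config: {feature: enabled} describing the running GPU
    Assumed policy: most specific compatible candidate wins.
    """
    best_label, best_score = None, -1
    for label, (gpu, features) in candidates.items():
        if gpu != device_gpu:
            continue  # built for a different GPU
        # Every feature the object names must match the device setting.
        if any(device_config.get(name) != on for name, on in features.items()):
            continue
        if len(features) > best_score:
            best_label, best_score = label, len(features)
    return best_label  # None if nothing in the fat binary is usable


fat_binary = {
    "generic": ("gfx906", {}),
    "f1_on": ("gfx906", {"feature1": True}),
    "f1_off": ("gfx906", {"feature1": False}),
}
print(select_device_object(fat_binary, "gfx906", {"feature1": True}))
```

Under this assumed policy, an object built for the plain GPU name acts as a fallback that is used only when no more specific object matches.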

As for embedding multiple device binaries for the same GPU, compiled with 
different options, in one fat binary: I don't think that is useful, since the 
HIP runtime does not know which one to load. Users can always implement their 
own mechanisms for using device binaries compiled with different options, with 
their own logic for choosing among them, so this is better left to the users.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D60620/new/

https://reviews.llvm.org/D60620


