yaxunl added a comment.

In D56318#1352962 <https://reviews.llvm.org/D56318#1352962>, @rjmccall wrote:

> If I was only concerned about `size_t`, your current solution would be fine.  
> My concern is that you really need to match *all* of the associated CPU 
> target's ABI choices, so your target really ought to be forwarding everything 
> to that target by default and only selectively overriding it in order to 
> support GPU-specific features.   Probably the easiest way to do that is via 
> inheritance.


We only need to match type sizes and alignments between device and host 
compilation; we do not need to match the function call ABI. In fact, our 
backend has its own function ABI which differs from the host ABI on Linux, 
but that does not prevent us from supporting HIP on Linux. This is because 
the device kernel is launched through the HIP runtime, which gets the kernel 
argument sizes and offsets from the kernel image and lays out the arguments 
for the kernel.

The latest CUDA kernel launching API, cuLaunchKernel 
(https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__EXEC.html#group__CUDA__EXEC_1gb8f3dc3031b40da29d5f9a7139e52e15),
works the same way. The host code only needs to pass an array of pointers to 
the arguments, whereas "the number of kernel parameters and their offsets and 
sizes do not need to be specified as that information is retrieved directly 
from the kernel's image".

If the device backend had to switch to a different ABI depending on the host 
environment, that would be very painful for the backend.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D56318/new/

https://reviews.llvm.org/D56318


