[PATCH] D101630: [HIP] Fix device-only compilation

Yaxun Liu via Phabricator via cfe-commits Tue, 01 Jun 2021 13:21:30 -0700

yaxunl added a comment.

In D101630#2791734 <https://reviews.llvm.org/D101630#2791734>, @tra wrote:


> In D101630#2787714 <https://reviews.llvm.org/D101630#2787714>, @yaxunl wrote:
>
>> How does nvcc --genco behave when there are multiple GPU arch's? Does it 
>> output a fat binary containing multiple ISA's? Also, does it support 
>> device-only compilation for intermediate outputs?
>
> It does not allow multiple outputs for `-ptx` and `-cubin` compilations, same 
> as clang behaves now:
>
>   $ ~/local/cuda-11.3/bin/nvcc -gencode=arch=compute_60,code=sm_60 \
>       -gencode=arch=compute_70,code=sm_70 -ptx foo.cu
>   nvcc fatal   : Option '--ptx (-ptx)' is not allowed when compiling for multiple GPU architectures
>
> NVCC does allow `-E` with multiple targets, but it does produce output for 
> only *one* of them.
>
> NVCC does bundle outputs for multiple GPU variants if `-fatbin` is used.

I think that for intermediate outputs (e.g. preprocessor expansion, IR, and 
assembly) it probably makes sense not to bundle by default. However, for the 
default action (emitting an object), we need to bundle by default, since that 
was the old behavior and existing HIP apps depend on it. We can then allow 
-fhip-bundle-device-output to override the default behavior.
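
As a rough illustration of that policy (not code from the patch), the decision 
could be sketched as below. The names `DeviceOutput`, `BundleFlag`, and 
`shouldBundle` are hypothetical, and the sketch assumes the option can be given 
in both a positive and a negated form, which the comment above does not 
spell out.

```cpp
#include <cassert>

// Hypothetical output kinds for a device-only compilation.
enum class DeviceOutput { Preprocessed, IR, Assembly, Object };

// Hypothetical model of the command line: Unset means the user passed
// neither -fhip-bundle-device-output nor an (assumed) negated form of it.
enum class BundleFlag { Unset, On, Off };

// Decide whether per-architecture device outputs are bundled together.
// An explicit flag always wins; otherwise only object emission bundles,
// which preserves the old default that existing HIP apps rely on.
inline bool shouldBundle(DeviceOutput kind, BundleFlag flag) {
  if (flag == BundleFlag::On)
    return true;
  if (flag == BundleFlag::Off)
    return false;
  return kind == DeviceOutput::Object;
}

// Small self-check of the defaults described in the comment above.
inline void checkDefaults() {
  assert(shouldBundle(DeviceOutput::Object, BundleFlag::Unset));
  assert(!shouldBundle(DeviceOutput::Preprocessed, BundleFlag::Unset));
  assert(!shouldBundle(DeviceOutput::IR, BundleFlag::Unset));
  assert(!shouldBundle(DeviceOutput::Assembly, BundleFlag::Unset));
}
```

Keeping the explicit flag as the first check means the default table only has 
to encode the backward-compatible case (objects are bundled) and nothing else.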


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101630/new/

https://reviews.llvm.org/D101630

_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
