[PATCH] D101630: [HIP] Fix device-only compilation

Yaxun Liu via Phabricator via cfe-commits Fri, 30 Apr 2021 10:26:59 -0700

yaxunl added a comment.

In D101630#2729573 <https://reviews.llvm.org/D101630#2729573>, @tra wrote:


> CUDA compilation currently errors out if `-o` is used when more than one 
> output would be produced.
> E.g.
>
>   % bin/clang++ -x cuda --offload-arch=sm_60 --offload-arch=sm_70 --cuda-path=$HOME/local/cuda-10.2  zz.cu -c -E
>   #... preprocessed output from host and 2 GPU compilations is printed out
>
>   % bin/clang++ -x cuda --offload-arch=sm_60 --offload-arch=sm_70 --cuda-path=$HOME/local/cuda-10.2  zz.cu -c -E  -o foo.out
>   clang-13: error: cannot specify -o when generating multiple output files
>
>   % bin/clang++ -x cuda --offload-arch=sm_60 --offload-arch=sm_70 --cuda-path=$HOME/local/cuda-10.2  zz.cu -c --cuda-device-only -E  -o foo.out
>   clang-13: error: cannot specify -o when generating multiple output files
>
>   % bin/clang++ -x cuda --offload-arch=sm_60 --offload-arch=sm_70 --cuda-path=$HOME/local/cuda-10.2  zz.cu -c --cuda-device-only -S  -o foo.out
>   clang-13: error: cannot specify -o when generating multiple output files
>
> I think I've borrowed that behavior from some of the macOS-related
> functionality, so we do have a somewhat established model of how to handle
> multiple outputs.
> Wrapping multiple outputs into a single bundle could be an option too.
>
> The question is -- what would make most sense.
> Are bundles useful in cases when the user would use options that give us 
> intermediate compiler outputs?
>
> In my experience, most such use cases are intended for manual examination
> of compiler output, and as such I'd prefer to keep the results immediately
> usable, without having to unbundle them. In such cases we're already changing
> command line options, and adjusting them to produce the output from the
> specific sub-compilation I want is trivial. Having to unbundle things is more
> complicated, as the bundler/unbundler tool, as it is right now, is poorly
> documented and not particularly user-friendly. If it is to become a
> user-facing tool like ar/nm/objdump, it would need some improvements.
>
> If you do have use cases where you need to bundle intermediate results, are
> they for human consumption or for tooling? Perhaps we should make the
> "bundle the outputs" behavior controllable by a flag, and keep enforcing
> "one output only" as the default.

We use ccache and need a single output for -E with device compilation. There
are also use cases that emit bitcode for device compilation and link it later.
Both of these require the output to be bundled.

If users want the unbundled output, they can use -save-temps. Is that
sufficient?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101630/new/

https://reviews.llvm.org/D101630
