tra added a subscriber: echristo.
tra added a comment.

In D101630#2777346 <https://reviews.llvm.org/D101630#2777346>, @yaxunl wrote:

> In D101630#2748513 <https://reviews.llvm.org/D101630#2748513>, @tra wrote:
>
>> How about this:
>> If the user explicitly specified `--cuda-host-only` or `--cuda-device-only`, 
>> then by default only allow producing the natural output format, unless a 
>> bundled output is requested by an option. This should keep existing users 
>> working.
>> If the compilation is done without explicitly requested sub-compilation(s), 
>> then bundle the output by default. This should keep the GPU-unaware tools 
>> like ccache happy as they would always get the single output they expect.
>>
>> WDYT?
>
> `--cuda-host-only` always have one output, therefore there is no point of 
> bundle its output. We only need to decide the proper behavior of 
> `--cuda-device-only`.

That still fits my proposal: require a single sub-compilation and do not 
bundle the output.
The point is that this behavior stays consistent regardless of whether we're 
compiling CUDA or HIP, for the host or for the device.

> How about keeping the original default behavior of not bundling if users do 
> not specify output file, 
> whereas bundle the output if users specifying output file.

I think it will make things worse. Compiler output should not change depending 
on whether `-o` is used.

> Since specifying output file indicates users  requesting one output. 
> -f[no-]hip-bundle-device-output override the default behavior.

I disagree. When a user specifies the output, the intent is to specify the 
**location** of the output, not its contents or format.

Telling the compiler to produce a different output format should not depend 
on whether the output location is specified.

I think our options are:

- Always bundle `--cuda-device-only` outputs by default. This is consistent for 
HIP compilation, but deviates from CUDA, which can't do bundling. Also, 
bundling the output of a single-target sub-compilation deviates from both CUDA 
and regular C++ compilation, which is what most users would be familiar with 
and which would probably be the most sensible default for a single 
sub-compilation. It can be overridden with an option, but that goes against 
the principle that it is the specialized use case that should need extra 
options. The most common use case should not need them.

- Only bundle multiple sub-compilations' output by default. This would preserve 
the sensible single sub-compilation behavior. The downside is that it makes the 
output format depend on whether the compiler ends up doing one or many 
sub-compilations. E.g. `--offload-arch=A -S` would produce ASM and 
`--offload-arch=A --offload-arch=B -S` would produce a bundle. If the user 
can't control some of the compiler options, such an approach would make the 
output format unpredictable. E.g. passing `--offload-arch=foo` to the compiler 
on godbolt would all of a sudden produce bundled output instead of assembly 
text or a sensible error message saying that you're trying to produce multiple 
outputs.

- Keep the current behavior (insist on single sub-compilation) as the default, 
and allow overriding it for HIP with the flag. IMO that's the most consistent 
option and I still think it's the one most suitable to keep as the default.
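
To make the comparison concrete, here is a rough sketch of how the three 
defaults would behave for a `--cuda-device-only` compilation. It is a toy 
model, not driver code: the `Output` enum, the function names, and the 
`bundle_flag` parameter (standing in for an explicit 
`-f[no-]hip-bundle-device-output`) are all made up for illustration, and it 
assumes that unbundled output from more than one sub-compilation is rejected 
with an error.

```python
from enum import Enum
from typing import Optional


class Output(Enum):
    NATURAL = "natural"  # e.g. assembly text for -S
    BUNDLE = "bundle"    # one file holding every device output
    ERROR = "error"      # several outputs, but only one can be produced


def device_only_output(default_bundle: bool, num_archs: int,
                       bundle_flag: Optional[bool] = None) -> Output:
    """Hypothetical model: an explicit flag always overrides the default."""
    bundle = default_bundle if bundle_flag is None else bundle_flag
    if bundle:
        return Output.BUNDLE
    # Without bundling, only a single sub-compilation has a natural output.
    return Output.NATURAL if num_archs == 1 else Output.ERROR


def always_bundle(num_archs: int, bundle_flag: Optional[bool] = None) -> Output:
    """First option: bundle by default, whatever the number of targets."""
    return device_only_output(True, num_archs, bundle_flag)


def bundle_if_multiple(num_archs: int, bundle_flag: Optional[bool] = None) -> Output:
    """Second option: the default depends on the number of sub-compilations."""
    return device_only_output(num_archs > 1, num_archs, bundle_flag)


def single_unless_flag(num_archs: int, bundle_flag: Optional[bool] = None) -> Output:
    """Third option: never bundle unless explicitly asked to."""
    return device_only_output(False, num_archs, bundle_flag)


if __name__ == "__main__":
    for policy in (always_bundle, bundle_if_multiple, single_unless_flag):
        # Show the default result for one and for two --offload-arch values.
        print(policy.__name__, [policy(n).value for n in (1, 2)])
```

In this model only the second option changes the output format when a second 
`--offload-arch` is added without any bundling flag. The third option turns 
that case into an error instead.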

I can see the benefit of always bundling for HIP, but I also believe that 
keeping things simple, consistent and predictable is important. Considering 
that we're tinkering in a relatively obscure niche of the compiler, it probably 
does not matter all that much, but it should not stop us from trying to figure 
out the best approach in a principled way.

I think we could benefit from a second opinion on which approach would make 
more sense for clang. 
Summoning @jdoerfert and @echristo.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101630/new/

https://reviews.llvm.org/D101630
