yaxunl added a comment.

In D101630#2729975 <https://reviews.llvm.org/D101630#2729975>, @tra wrote:

> What will happen with this patch in the following scenarios:
>
> - `--offload-arch=A -S -o out.s`
> - `--offload-arch=A --offload-arch=B -S -o out.s`
>
> I would expect the first case to produce a plain text assembly file. With
> this patch the second case will produce a bundle. With some build tools users
> only add to the various compiler options provided by the system. Depending on
> whether those system-provided options include an `--offload-arch`, the format
> of the output in the first example becomes unstable. So the consistent way
> would be to always bundle everything, but that breaks (or at least
> complicates) the normal single-output case and makes it deviate from what
> users expect from a regular C++ compilation.
>
> In D101630#2729768 <https://reviews.llvm.org/D101630#2729768>, @yaxunl wrote:
>
>> We use ccache and need one output for -E with device compilation. Also there
>> are use cases to emit bitcode for device compilation and link them later.
>> These use cases require output to be bundled.
>
> This is a good point. I don't think I've ever used ccache on a CUDA
> compilation, but I see how ccache may get surprised.
>
> Considering the scenario above, I think a better way to handle it would be to
> teach ccache about CUDA/HIP compilation. It's a similar situation to
> support for split DWARF, where the compiler does something beyond the expected
> one-input to one-output transformation.
> E.g. we could tell it to use stdout for `-E`. Or implement the
> `bundle-everything` flag in clang and let ccache use it if it needs to have a
> single output.
>
>> If users want to get the unbundled output, they can use -save-temps. Is it
>> sufficient?
>
> In terms of saving intermediate outputs - yes. In terms of usability - no.
> Sometimes I want one particular intermediate result saved with an exact filename
> (or piped to stdout), and saving a bunch and then picking one would be a pretty
> annoying usability regression for me.

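To make the two scenarios concrete, here is a rough model of the output-format rule as I read the description quoted above. It is only a sketch and not clang driver code: `OutputKind`, `deviceOutputKind` and the `bundleEverything` parameter are hypothetical names, with `bundleEverything` standing in for the `bundle-everything` flag mentioned in the quote.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of the behaviour under discussion; not clang code.
enum class OutputKind { Plain, Bundle };

// offloadArchs: every value passed via --offload-arch, including any that
// were injected by system-provided compiler options.
// bundleEverything: assumed switch that forces a bundle for any arch count.
OutputKind deviceOutputKind(const std::vector<std::string> &offloadArchs,
                            bool bundleEverything) {
  if (bundleEverything)
    return OutputKind::Bundle; // stable: format never depends on arch count
  // Otherwise the format flips as soon as a second arch is present.
  return offloadArchs.size() > 1 ? OutputKind::Bundle : OutputKind::Plain;
}
```

With `bundleEverything` false, `{"A"}` gives plain output and `{"A", "B"}` gives a bundle, which is the instability described above. With it true, both give a bundle.
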
How about adding an option, -fhip-bundle-device-output? When it is on, device output is bundled regardless of how many GPU archs are specified. It would be on by default.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101630/new/

https://reviews.llvm.org/D101630

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits