[PATCH] D157750: Properly handle -fsplit-machine-functions for fatbinary compilation

Artem Belevich via Phabricator via cfe-commits Mon, 21 Aug 2023 10:45:18 -0700

tra added inline comments.


================
Comment at: clang/test/Driver/fsplit-machine-functions-with-cuda-nvptx.c:9
+
+// Check that -fsplit-machine-functions is passed to both x86 and cuda 
compilation and does not cause driver error.
+// MFS2: -fsplit-machine-functions
----------------
MaskRay wrote:
> shenhan wrote:
> > MaskRay wrote:
> > > MaskRay wrote:
> > > > tra wrote:
> > > > > shenhan wrote:
> > > > > > tra wrote:
> > > > > > > shenhan wrote:
> > > > > > > > tra wrote:
> > > > > > > > > We will still see a warning, right? So, for someone compiling 
> > > > > > > > > with `-Werror` that's going to be a problem.
> > > > > > > > > 
> > > > > > > > > Also, if the warning is issued from the top-level driver, we 
> > > > > > > > > may not even be able to suppress it when we disable splitting 
> > > > > > > > > on GPU side with `-Xarch_device -fno-split-machine-functions`.
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > We will still see a warning, right?
> > > > > > > > Yes, there still will be a warning. We've discussed it and we 
> > > > > > > > think that pass -fsplit-machine-functions in this case is not a 
> > > > > > > > proper usage and a warning is warranted, and it is not good 
> > > > > > > > that skip doing split silently while uses explicitly ask for it.
> > > > > > > > 
> > > > > > > > > Also, if the warning is issued from the top-level driver
> > > > > > > > The warning will not be issued from the top-level driver, it 
> > > > > > > > will be issued when configuring optimization passes.
> > > > > > > > So:
> > > > > > > > 
> > > > > > > > 
> > > > > > > >   - -fsplit-machine-functions -Xarch_device 
> > > > > > > > -fno-split-machine-functions
> > > > > > > > Will enable MFS for host, disable MFS for gpus and without any 
> > > > > > > > warnings.
> > > > > > > > 
> > > > > > > >   - -Xarch_host -fsplit-machine-functions
> > > > > > > > The same as the above
> > > > > > > > 
> > > > > > > >   - -Xarch_host -fsplit-machine-functions -Xarch_device 
> > > > > > > > -fno-split-machine-functions
> > > > > > > > The same as the above
> > > > > > > > 
> > > > > > > > We've discussed it and we think that pass 
> > > > > > > > -fsplit-machine-functions in this case is not a proper usage 
> > > > > > > > and a warning is warranted, and it is not good that skip doing 
> > > > > > > > split silently while uses explicitly ask for it.
> > > > > > > 
> > > > > > > I would agree with that assertion if we were talking exclusively 
> > > > > > > about CUDA compilation.
> > > > > > > However, a common real world use pattern is that the flags are 
> > > > > > > set globally for all C++ compilations, and then CUDA compilations 
> > > > > > > within the project need to do whatever they need to to keep 
> > > > > > > things working. The original user intent was for the option to 
> > > > > > > affect the host compilation. There's no inherent assumption that 
> > > > > > > it will do anything useful for the GPU.
> > > > > > > 
> > > > > > > In number of similar cases in the past we did settle on silently 
> > > > > > > ignoring some top-level flags that we do expect to encounter in 
> > > > > > > real projects, but which made no sense for the GPU. E.g. 
> > > > > > > sanitizers. If the project is built w/ sanitizer enabled, the 
> > > > > > > idea is to sanitize the host code, The GPU code continues to be 
> > > > > > > built w/o sanitizer enabled. 
> > > > > > > 
> > > > > > > Anyways, as long as we have a way to deal with it it's not a big 
> > > > > > > deal one way or another.
> > > > > > > 
> > > > > > > > -fsplit-machine-functions -Xarch_device 
> > > > > > > > -fno-split-machine-functions
> > > > > > > > Will enable MFS for host, disable MFS for gpus and without any 
> > > > > > > > warnings.
> > > > > > > 
> > > > > > > OK. This will work.
> > > > > > > 
> > > > > > > 
> > > > > > > In number of similar cases in the past we did settle on silently 
> > > > > > > ignoring some top-level flags that we do expect to encounter in 
> > > > > > > real projects, but which made no sense for the GPU. E.g. 
> > > > > > > sanitizers. If the project is built w/ sanitizer enabled, the 
> > > > > > > idea is to sanitize the host code, The GPU code continues to be 
> > > > > > > built w/o sanitizer enabled.
> > > > > > 
> > > > > > Can I understand it this way - if the compiler is **only** building 
> > > > > > for CPUs, then silently ignore any optimization flags is not a good 
> > > > > > behavior. If the compiler is building CPUs and GPUs, it is still 
> > > > > > not a good behavior to silently ignore optimization flags for CPUs, 
> > > > > > but it is probably ok to silently ignore optimization flags for 
> > > > > > GPUs.
> > > > > > 
> > > > > > > OK. This will work.
> > > > > > Thanks for confirming.
> > > > > >  it is probably ok to silently ignore optimization flags for GPUs.
> > > > > 
> > > > > In this case, yes. 
> > > > > 
> > > > > I think the most consistent way to handle the situation is to keep 
> > > > > the warning in place at cc1 compiler level, but change the driver 
> > > > > behavior (and document it) so that it does not pass the splitting 
> > > > > options to offloading sub-compilations. This way we'll do the 
> > > > > sensible thing for the most common use case, yet would still warn if 
> > > > > the user tries to enable the splitting where they should not (e.g. by 
> > > > > using `-Xclang -fsplit-machine-functions` during CUDA compilation)
> > > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > There are excessive spaces before `%clang`. We should keep just one 
> > > > space: `RUN: %clang`
> > > I agree with @tra's analysis. Either do nothing on Clang side and 
> > > requiring `-fsplit-machine-functions -Xarch_device 
> > > -fno-split-machine-functions` or ignoring the option when creating a 
> > > device job works for me.
> > > 
> > > This patch changed the behavior in an unintended direction.
> > > Either do nothing on Clang side and requiring -fsplit-machine-functions 
> > > -Xarch_device -fno-split-machine-functions or ignoring the option when 
> > > creating a device job works for me.
> > > This patch changed the behavior in an unintended direction.
> > 
> > Thanks Ray. Just a little bit confused, what this patch does is indeed 
> > "requiring -fsplit-machine-functions -Xarch_device 
> > -fno-split-machine-functions", before this patch, this usage will cause an 
> > error.
> > 
> > What do you suggest?
> I think we should go back to the state before this patch. Do either
> 
> (a) require `-Xarch_device -fno-split-machine-functions` for 
> `--cuda-gpu-arch` users
> (b) ignore `-fsplit-machine-functions` when creating a `-triple" 
> "nvptx64-nvidia-cuda` job in the driver.
> 
> Personally I'd prefer (a), as I don't want to maintain a long list of ignored 
> options in the driver for certain very specific optimization options, but I 
> am fine with (b). In either case, I think we need to revert this patch to go 
> to a clean state.
> a) require -Xarch_device -fno-split-machine-functions for --cuda-gpu-arch 
> users

It will break the currently valid compilation options and CUDA users would have 
no other viable option other than keep spamming their build flags with 
`-Xarch_device -fno-split-machine-functions`. E.g. if the flag is set 
build-wide on all compilations, it will break the previously working CUDA 
compilations due to no fault on the user side. Yes, `-Xarch_device 
-fno-split-machine-functions` will work around it, but it will have to be done 
by everyone who ever faces `-fsplit-machine-functions` in a CUDA compilation.

If we have to apply the workaround in 100% of the use cases, we should not 
require every user to do it.

> (b) ignore -fsplit-machine-functions when creating a -triple" 
> "nvptx64-nvidia-cuda job in the driver.

I believe that's the way to go. We already do that in a number of other cases 
like sanitizers, or other things not supported on the GPU side in principle.



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157750/new/

https://reviews.llvm.org/D157750

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D157750: Properly handle -fsplit-machine-functions for fatbinary compilation

Reply via email to