[PATCH] D157750: Properly handle -fsplit-machine-functions for fatbinary compilation

Fangrui Song via Phabricator via cfe-commits Mon, 21 Aug 2023 10:22:30 -0700

MaskRay added inline comments.


================
Comment at: clang/test/Driver/fsplit-machine-functions-with-cuda-nvptx.c:9
+
+// Check that -fsplit-machine-functions is passed to both x86 and cuda 
compilation and does not cause driver error.
+// MFS2: -fsplit-machine-functions
----------------
shenhan wrote:
> MaskRay wrote:
> > MaskRay wrote:
> > > tra wrote:
> > > > shenhan wrote:
> > > > > tra wrote:
> > > > > > shenhan wrote:
> > > > > > > tra wrote:
> > > > > > > > We will still see a warning, right? So, for someone compiling 
> > > > > > > > with `-Werror` that's going to be a problem.
> > > > > > > > 
> > > > > > > > Also, if the warning is issued from the top-level driver, we 
> > > > > > > > may not even be able to suppress it when we disable splitting 
> > > > > > > > on GPU side with `-Xarch_device -fno-split-machine-functions`.
> > > > > > > > 
> > > > > > > > 
> > > > > > > > We will still see a warning, right?
> > > > > > > Yes, there still will be a warning. We've discussed it and we 
> > > > > > > think that pass -fsplit-machine-functions in this case is not a 
> > > > > > > proper usage and a warning is warranted, and it is not good that 
> > > > > > > skip doing split silently while uses explicitly ask for it.
> > > > > > > 
> > > > > > > > Also, if the warning is issued from the top-level driver
> > > > > > > The warning will not be issued from the top-level driver, it will 
> > > > > > > be issued when configuring optimization passes.
> > > > > > > So:
> > > > > > > 
> > > > > > > 
> > > > > > >   - -fsplit-machine-functions -Xarch_device 
> > > > > > > -fno-split-machine-functions
> > > > > > > Will enable MFS for host, disable MFS for gpus and without any 
> > > > > > > warnings.
> > > > > > > 
> > > > > > >   - -Xarch_host -fsplit-machine-functions
> > > > > > > The same as the above
> > > > > > > 
> > > > > > >   - -Xarch_host -fsplit-machine-functions -Xarch_device 
> > > > > > > -fno-split-machine-functions
> > > > > > > The same as the above
> > > > > > > 
> > > > > > > We've discussed it and we think that pass 
> > > > > > > -fsplit-machine-functions in this case is not a proper usage and 
> > > > > > > a warning is warranted, and it is not good that skip doing split 
> > > > > > > silently while uses explicitly ask for it.
> > > > > > 
> > > > > > I would agree with that assertion if we were talking exclusively 
> > > > > > about CUDA compilation.
> > > > > > However, a common real world use pattern is that the flags are set 
> > > > > > globally for all C++ compilations, and then CUDA compilations 
> > > > > > within the project need to do whatever they need to to keep things 
> > > > > > working. The original user intent was for the option to affect the 
> > > > > > host compilation. There's no inherent assumption that it will do 
> > > > > > anything useful for the GPU.
> > > > > > 
> > > > > > In number of similar cases in the past we did settle on silently 
> > > > > > ignoring some top-level flags that we do expect to encounter in 
> > > > > > real projects, but which made no sense for the GPU. E.g. 
> > > > > > sanitizers. If the project is built w/ sanitizer enabled, the idea 
> > > > > > is to sanitize the host code, The GPU code continues to be built 
> > > > > > w/o sanitizer enabled. 
> > > > > > 
> > > > > > Anyways, as long as we have a way to deal with it it's not a big 
> > > > > > deal one way or another.
> > > > > > 
> > > > > > > -fsplit-machine-functions -Xarch_device 
> > > > > > > -fno-split-machine-functions
> > > > > > > Will enable MFS for host, disable MFS for gpus and without any 
> > > > > > > warnings.
> > > > > > 
> > > > > > OK. This will work.
> > > > > > 
> > > > > > 
> > > > > > In number of similar cases in the past we did settle on silently 
> > > > > > ignoring some top-level flags that we do expect to encounter in 
> > > > > > real projects, but which made no sense for the GPU. E.g. 
> > > > > > sanitizers. If the project is built w/ sanitizer enabled, the idea 
> > > > > > is to sanitize the host code, The GPU code continues to be built 
> > > > > > w/o sanitizer enabled.
> > > > > 
> > > > > Can I understand it this way - if the compiler is **only** building 
> > > > > for CPUs, then silently ignore any optimization flags is not a good 
> > > > > behavior. If the compiler is building CPUs and GPUs, it is still not 
> > > > > a good behavior to silently ignore optimization flags for CPUs, but 
> > > > > it is probably ok to silently ignore optimization flags for GPUs.
> > > > > 
> > > > > > OK. This will work.
> > > > > Thanks for confirming.
> > > > >  it is probably ok to silently ignore optimization flags for GPUs.
> > > > 
> > > > In this case, yes. 
> > > > 
> > > > I think the most consistent way to handle the situation is to keep the 
> > > > warning in place at cc1 compiler level, but change the driver behavior 
> > > > (and document it) so that it does not pass the splitting options to 
> > > > offloading sub-compilations. This way we'll do the sensible thing for 
> > > > the most common use case, yet would still warn if the user tries to 
> > > > enable the splitting where they should not (e.g. by using `-Xclang 
> > > > -fsplit-machine-functions` during CUDA compilation)
> > > > 
> > > > 
> > > > 
> > > > 
> > > There are excessive spaces before `%clang`. We should keep just one 
> > > space: `RUN: %clang`
> > I agree with @tra's analysis. Either do nothing on Clang side and requiring 
> > `-fsplit-machine-functions -Xarch_device -fno-split-machine-functions` or 
> > ignoring the option when creating a device job works for me.
> > 
> > This patch changed the behavior in an unintended direction.
> > Either do nothing on Clang side and requiring -fsplit-machine-functions 
> > -Xarch_device -fno-split-machine-functions or ignoring the option when 
> > creating a device job works for me.
> > This patch changed the behavior in an unintended direction.
> 
> Thanks Ray. Just a little bit confused, what this patch does is indeed 
> "requiring -fsplit-machine-functions -Xarch_device 
> -fno-split-machine-functions", before this patch, this usage will cause an 
> error.
> 
> What do you suggest?
I think we should go back to the state before this patch. Do either

(a) require `-Xarch_device -fno-split-machine-functions` for `--cuda-gpu-arch` 
users
(b) ignore `-fsplit-machine-functions` when creating a `-triple" 
"nvptx64-nvidia-cuda` job in the driver.

Personally I'd prefer (a), as I don't want to maintain a long list of ignored 
options in the driver for certain very specific optimization options, but I am 
fine with (b). In either case, I think we need to revert this patch to go to a 
clean state.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157750/new/

https://reviews.llvm.org/D157750

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D157750: Properly handle -fsplit-machine-functions for fatbinary compilation

Reply via email to