[PATCH] D47394: [OpenMP][Clang][NVPTX] Replace bundling with partial linking for the OpenMP NVPTX device offloading toolchain

Samuel Antao via Phabricator via cfe-commits Tue, 31 Jul 2018 04:19:47 -0700

sfantao added a comment.

Hi Doru,


Thanks for updating the patch. I've a few comments below.



================
Comment at: include/clang/Driver/Compilation.h:312
+  /// \param skipBundler - bool value set once by the driver.
+  void setSkipOffloadBundler(bool skipBundler);
+
----------------
gtbercea wrote:
> sfantao wrote:
> > Why is this a property of the compilation and not of a set of actions 
> > referring to a given target? That would allow one to combine in the same 
> > compilation targets requiring the bundler and targets that wouldn't. 
> This was a way to pass this information to the OpenMP NVPTX device toolchain.
> 
> Both the Driver OpenMP NVPTX toolchain need to agree on the usage of the new 
> scheme (proposed in this patch) or the old scheme (the one that is in the 
> compiler today).
> 
> 
I understand, but the way I see it is that it is the toolchain that skips the 
bundler not the compilation. I understand that as of this patch, you skip only 
if there is a single nvptx target. If you have more than one target, as some 
tests do, some toolchains will still need the bundler. So, we are making what 
happens with the nvptx target dependent of other toolchains. Is this an 
intended effect of this patch?


================
Comment at: lib/Driver/Compilation.cpp:276
+void Compilation::setSkipOffloadBundler(bool skipBundler) {
+  skipOffloadBundler = skipBundler;
+}
----------------
gtbercea wrote:
> sfantao wrote:
> > Given the logic you have below, you are assuming this is not set to false 
> > ever. It would be wise to get an assertion here in case you end up having 
> > toolchains skipping and others don't. If that is just not supported a 
> > diagnostic should be added instead.
> > 
> > The convention is that local variables use CamelCase.
> The checks I added in the Driver will set this flag to true if all toolchains 
> Clang offloads to support the skipping of the bundler/unbundler for object 
> files. Currently only NVPTX toolchain can skip the bundler/unbundler for 
> object files so the code path in this patch will be taken only for:
> 
> -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda
Ok, if that is the case, just add an assertion here.


================
Comment at: lib/Driver/Driver.cpp:2943
+    }
+  }
+
----------------
gtbercea wrote:
> sfantao wrote:
> > Can you just implement this check in the definition of `Compilation: 
> > canSkipClangOffloadBundler` and get rid of `setSkipOffloadBundler`? All the 
> > requirted information is already in `Compilation` under `C.getInputArgs()`.
> The driver needs to have the result of this check available. The flag is 
> passed to the step which adds host-device dependencies. If the bundler can be 
> skipped then the unbundling action is not required.
> 
> I guess this could be implemented in Compilation. Even so I would like it to 
> happen only once like it does here and not every time someone queries the 
> "can I skip the bundler" flag.
> 
> I wanted this check to happen only once hence why I put in on the driver 
> side. The result of this check needs to be available in Driver.cpp and in 
> Cuda.cpp files (see usage in this patch). Compilation keeps track of the flag 
> because skipping the bundler is an all or nothing flag: you can skip the 
> bundler/unbundler for object files if and only if all toolchains you are 
> offloading to can skip it.
> 
Right, in these circumstances "can skip bundler" is the same as "do I have a 
single toolchain" and "is that toolchain nvptx". This is fairly inexpensive to 
do, so I don't really see the need to record this state in the driver. It will 
also be clearer what are the conditions for which you skip the bundler.


================
Comment at: lib/Driver/ToolChains/Cuda.cpp:496
+              ? CudaVirtualArchToString(VirtualArchForCudaArch(gpu_arch))
+              : GPUArch.str().c_str();
+      const char *PtxF =
----------------
Why don't create fatbins instead of cubins in all cases. For the purposes of 
OpenMP they are equivalent, i.e. nvlink can interpret them in the same way, no?


================
Comment at: lib/Driver/ToolChains/Cuda.cpp:517
+    //   --image=profile=sm_35@compute_35,file=ompprint.compute_35.sm_35.cubin
+    //   --embedded-fatbin=ompprint.fatbin.c --cuda --device-c
+    const char *Exec = Args.MakeArgString(TC.GetProgramPath("fatbinary"));
----------------
I'd move this comment to the top of this session so that we know what is going 
on in the code above.


================
Comment at: lib/Driver/ToolChains/Cuda.cpp:524
+    // the hash of the full path of the file.
+    std::hash<std::string> hash_fn;
+    size_t hash = hash_fn(llvm::sys::path::filename(Output.getFilename()));
----------------
CamelCase


================
Comment at: lib/Driver/ToolChains/Cuda.cpp:664
+      // Anything that's not a file name is potentially a static library
+      // so treat it as such.
+      if (C.canSkipOffloadBundler())
----------------
So, what if it is not a static library?


================
Comment at: test/Driver/openmp-offload-gpu-linux.c:24
+// CHK-PTXAS-CUBIN-BUNDLING: clang++{{.*}}" "-c" "-o" 
"[[HOSTDEV:.*\.o]]"{{.*}}" "[[FATBINC]]" "-D__NV_MODULE_ID=
+// CHK-PTXAS-CUBIN-BUNDLING-NOT: clang-offload-bundler{{.*}}" "-type=o" 
{{.*}}"-inputs={{.*}}[[CUBIN]]
+// CHK-PTXAS-CUBIN-BUNDLING: ld" "-r" "[[HOSTDEV]]" "{{.*}}.o" "-o" "{{.*}}.o"
----------------
`clang-offload-bundler` should be sufficient here.


Repository:
  rC Clang

https://reviews.llvm.org/D47394



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D47394: [OpenMP][Clang][NVPTX] Replace bundling with partial linking for the OpenMP NVPTX device offloading toolchain

Reply via email to