Re: [PATCH] D16082: [CUDA] Invoke ptxas and fatbinary during compilation.

Justin Lebar via cfe-commits Mon, 11 Jan 2016 16:56:28 -0800

jlebar added a comment.

In http://reviews.llvm.org/D16082#324138, @tra wrote:


> Make sure it works with -save-temps and -fintegrated-as/-fno-integrated-as. 
> They tend to throw wrenches into pipeline construction.


Thanks.  All of them worked except -fintegrated-as, which was causing us not to 
invoke ptxas.  Fixed, and added a test.


================
Comment at: lib/Driver/Driver.cpp:1380
@@ +1379,3 @@
+      C.MakeAction<LinkJobAction>(DeviceActions, types::TY_CUDA_FATBIN),
+      /* GpuArchName = */ nullptr,
+      /* AtTopLevel = */ false);
----------------
tra wrote:
> So, you're treating GpuArchName==nullptr as a special case of DeviceAction 
> for fatbin?
> Perhaps it warrants its own CudaDeviceLinkAction?
It's a special case either way, but there are enough places that look for a 
CudaDeviceAction that I *think* this special case is cleaner than having a 
bunch of if (CudaDeviceAction || CudaDeviceLinkAction)s floating around.

================
Comment at: lib/Driver/Driver.cpp:1620-1625
@@ -1602,3 +1619,8 @@
   case phases::Assemble:
-    return C.MakeAction<AssembleJobAction>(Input, types::TY_Object);
+    // Assembling generates an object file, except when we're assembling CUDA,
+    // in which case we get a cubin file.
+    return C.MakeAction<AssembleJobAction>(
+        std::move(Input), TC.getTriple().getOS() == llvm::Triple::CUDA
+                              ? types::TY_CUDA_CUBIN
+                              : types::TY_Object);
   }
----------------
tra wrote:
> cubin *is* an ELF object file. Do we really need a new type here or can we 
> get by with TY_Object?
Got rid of the extra type.  Note that this means that our intermediate objs 
will be named foo.o instead of foo.cubin, which isn't optimal, but I agree that 
it's simpler without the extra type.

================
Comment at: lib/Driver/Driver.cpp:1880
@@ +1879,3 @@
+    // retrieve the Input's GPU arch.
+    II.setAction(A);
+    return II;
----------------
echristo wrote:
> Weren't you just adding it as part of the InputInfo constructor in 16078?
Yes, but we're doing something slightly different here.  Suppose you have a 
CudaDeviceAction --> BackendAction.  What do you want the InputInfo's Action to 
be?  Without this change, it's the BackendAction.  But we really want it to be 
the CudaDeviceAction.  Thus this line.

I updated the comment in an attempt to clarify.

================
Comment at: lib/Driver/ToolChains.cpp:4230
@@ -4224,3 +4229,3 @@
       // Skip this argument unless the architecture matches BoundArch
-      if (A->getValue(0) != StringRef(BoundArch))
+      if (!BoundArch || A->getValue(0) != StringRef(BoundArch))
         continue;
----------------
tra wrote:
> You may as well move it out of the loop and return early if BoundArch is 
> nullptr.
Tried this and discovered it's actually subtly different in a way that, 
thankfully, affects one of the tests I added.

if (!BoundArch) continue applies only if 
A->getOption().matches(options::OPT_Xarch__).  So we can't hoist it into an 
early return.

================
Comment at: lib/Driver/Tools.cpp:10677-10678
@@ +10676,4 @@
+    std::string Arch = A->getGpuArchName();
+    // We need to name pass an Arch of the form "sm_XX" for cubin files and
+    // "compute_XX" for ptx.  CudaDeviceAction's getGpuArchName() is guaranteed
+    // to match "sm_XX".
----------------
tra wrote:
> First line does not parse.
> 
> In general compute_XX does not necessarily match sm_XX.
> [[ http://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/#ptxas-options | 
> ptxas options ]] says that 
> 
> > Allowed values for this option: compute_20, compute_30, compute_35, 
> > compute_50, compute_52; and sm_20, sm_21, sm_30, sm_32, sm_35, sm_50 and 
> > sm_52
> 
> Note that there's no compute_21, compute_32. You'll need sm_XX -> compute_YY 
> map.
> 
> 
Oh goodness, how awful.  Thank you for catching that.  Fixed.


http://reviews.llvm.org/D16082



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Re: [PATCH] D16082: [CUDA] Invoke ptxas and fatbinary during compilation.

Reply via email to