[PATCH] D21419: [CUDA] Don't pass top-level -march down to device cc1 or ptxas.

Justin Lebar via cfe-commits Wed, 15 Jun 2016 16:24:48 -0700

jlebar created this revision.
jlebar added a reviewer: tra.
jlebar added subscribers: echristo, cfe-commits.


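Previously, if you ran e.g.

  $ clang -march=haswell -x cuda foo.cu

we would pass "-march=haswell -march=sm_20" down to the ptxas tool,
which causes it to assert, and rightly so!  This patch erases any
top-level -march from the device-side argument list before appending
-march=<BoundArch>, so the host flag reaches only host compilation.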
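
For readers unfamiliar with the driver, here is a minimal standalone sketch of
the erase-then-append idea.  It is not the driver code: the argument list is
modeled as a plain vector of strings, and the names ArgList, eraseMarch,
translateDeviceArgs and selfCheck are hypothetical, chosen only for
illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the driver's derived argument list.
using ArgList = std::vector<std::string>;

// Drop every "-march=<value>" entry (a loose analogue of erasing
// OPT_march_EQ from the real list).
void eraseMarch(ArgList &args) {
  const std::string prefix = "-march=";
  args.erase(std::remove_if(args.begin(), args.end(),
                            [&](const std::string &a) {
                              return a.compare(0, prefix.size(), prefix) == 0;
                            }),
             args.end());
}

// Build the device-side arguments.  An empty boundArch models the
// "no bound architecture" case, where the list is left untouched.
ArgList translateDeviceArgs(ArgList args, const std::string &boundArch) {
  if (!boundArch.empty()) {
    eraseMarch(args);                       // remove the host -march first
    args.push_back("-march=" + boundArch);  // then add the GPU arch
  }
  return args;
}

// Small self-check: the host -march must not survive on the device side.
void selfCheck() {
  const ArgList out = translateDeviceArgs({"-march=haswell", "-c"}, "sm_20");
  assert((out == ArgList{"-c", "-march=sm_20"}));
}
```

Without the erase step, the same input would yield both -march=haswell and
-march=sm_20, which is the combination ptxas rejects.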

http://reviews.llvm.org/D21419

Files:
  lib/Driver/ToolChains.cpp
  test/Driver/cuda-march.cu

Index: test/Driver/cuda-march.cu
===================================================================
--- /dev/null
+++ test/Driver/cuda-march.cu
@@ -0,0 +1,38 @@
+// Checks that cuda compilation does the right thing when passed -march.
+// (Specifically, we want to pass it to host compilation, but not to device
+// compilation or ptxas!)
+//
+// REQUIRES: clang-driver
+// REQUIRES: x86-registered-target
+// REQUIRES: nvptx-registered-target
+
+// RUN: %clang -### -target x86_64-linux-gnu -c -march=haswell %s 2>&1 | \
+// RUN: FileCheck -check-prefix HASWELL -check-prefix SM20 %s
+
+// RUN: %clang -### -target x86_64-linux-gnu -c -march=haswell --cuda-gpu-arch=sm_20 %s 2>&1 | \
+// RUN: FileCheck -check-prefix HASWELL -check-prefix SM20 %s
+
+// RUN: %clang -### -target x86_64-linux-gnu -c -march=skylake --cuda-gpu-arch=sm_30 %s 2>&1 | \
+// RUN: FileCheck -check-prefix SKYLAKE -check-prefix SM30 %s
+
+// SM20:clang
+// SM20: "-cc1"
+// SM20-SAME: "-triple" "nvptx
+// SM20-SAME: "-target-cpu" "sm_20"
+// SM20: ptxas
+// SM20-SAME: "--gpu-name" "sm_20"
+
+// SM30:clang
+// SM30: "-cc1"
+// SM30-SAME: "-triple" "nvptx
+// SM30-SAME: "-target-cpu" "sm_30"
+// SM30: ptxas
+// SM30-SAME: "--gpu-name" "sm_30"
+
+// HASWELL:clang
+// HASWELL-SAME: "-cc1"
+// HASWELL-SAME: "-target-cpu" "haswell"
+
+// SKYLAKE:clang
+// SKYLAKE-SAME: "-cc1"
+// SKYLAKE-SAME: "-target-cpu" "skylake"
Index: lib/Driver/ToolChains.cpp
===================================================================
--- lib/Driver/ToolChains.cpp
+++ lib/Driver/ToolChains.cpp
@@ -4676,8 +4676,10 @@
     DAL->append(A);
   }
 
-  if (BoundArch)
+  if (BoundArch) {
+    DAL->eraseArg(options::OPT_march_EQ);
     DAL->AddJoinedArg(nullptr, Opts.getOption(options::OPT_march_EQ), BoundArch);
+  }
   return DAL;
 }

_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
