https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108098

--- Comment #4 from Tobias Burnus <burnus at gcc dot gnu.org> ---
Indeed, the following patch, motivated by Thomas' work, also seems to help
with an older CUDA / JIT compiler.

If we are sure that CUDA 11.0 fixes it, we could generate that code only for:

  if (version2[0] < 7 || sm_ver2[0] < 8)

given that sm_80 is only supported since CUDA 11.0 and, likewise, CUDA 11.0
introduces PTX ISA version 7.0.
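
As a rough sketch of such a check, assuming that version2 and sm_ver2 point
at the text following ".version " and ".target sm_" in the PTX input (this
message does not show how they are set up), the test could look like the
following. The helper name needs_dummy_func and its parameter names are
hypothetical and not part of mkoffload.cc. The sketch parses the whole number
rather than comparing only the first character, so that a multi-digit value
is not misread. With "||", the stub is still emitted when only one of the two
values is below its threshold.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper, not part of mkoffload.cc.
   ptx_version is assumed to point just past ".version " (e.g. "6.3");
   sm_version is assumed to point just past ".target sm_" (e.g. "35").  */
static bool
needs_dummy_func (const char *ptx_version, const char *sm_version)
{
  /* strtol stops at the first non-digit, so "6.3" yields 6.  */
  long ptx_major = strtol (ptx_version, NULL, 10);

  /* "35" yields 35; dividing by 10 gives the major part, 3.  */
  long sm_major = strtol (sm_version, NULL, 10) / 10;

  /* Same shape as the condition above:
     PTX ISA older than 7.0, or target older than sm_80.  */
  return ptx_major < 7 || sm_major < 8;
}
```

For example, needs_dummy_func ("6.3", "35") is true, while
needs_dummy_func ("7.0", "80") is false.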

--- a/gcc/config/nvptx/mkoffload.cc
+++ b/gcc/config/nvptx/mkoffload.cc
@@ -358,4 +358,9 @@ process (FILE *in, FILE *out, uint32_t omp_requires)
       fprintf (out, "\"\n\t\".file 1 \\\"<dummy>\\\"\"\n");

+      fprintf (out, "\n\t\".func __dummy$func ( );\"\n");
+      fprintf (out, "\t\".func __dummy$func ( )\"\n");
+      fprintf (out, "\t\"{\"\n");
+      fprintf (out, "\t\"}\"\n");
+
       size_t fidx = 0;
       for (id = func_ids; id; id = id->next)
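
To see what the four added fprintf calls write, the format strings can be
copied verbatim into a stand-alone function. This is only an illustration;
the wrapper name emit_dummy_func is hypothetical and does not exist in
mkoffload.cc.

```c
#include <stdio.h>

/* Hypothetical stand-alone wrapper; the format strings are copied
   verbatim from the patch above.  */
static void
emit_dummy_func (FILE *out)
{
  /* Each call writes one tab-indented, double-quoted C string literal
     on its own line; the first call also starts with a newline.  */
  fprintf (out, "\n\t\".func __dummy$func ( );\"\n");  /* forward declaration */
  fprintf (out, "\t\".func __dummy$func ( )\"\n");     /* start of definition */
  fprintf (out, "\t\"{\"\n");                          /* empty body opens */
  fprintf (out, "\t\"}\"\n");                          /* empty body closes */
}
```

Calling emit_dummy_func (stdout) writes an empty line followed by four quoted
literals: a declaration of __dummy$func and then its definition with an empty
body. In the generated file, adjacent string literals are concatenated by the
C compiler.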
