https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108098
--- Comment #4 from Tobias Burnus <burnus at gcc dot gnu.org> ---
Indeed, the following seems to also help with an older CUDA / JIT compiler.
Motivated by Thomas' work.

If we are sure that CUDA 11.0 fixes it, we could generate that code only for:
  if (version2[0] < 7 || sm_ver2[0] < 8)
given that sm_80 is only supported since CUDA 11.0 and, likewise, CUDA 11.0
introduces PTX ISA version 7.0.

--- a/gcc/config/nvptx/mkoffload.cc
+++ b/gcc/config/nvptx/mkoffload.cc
@@ -358,4 +358,9 @@ process (FILE *in, FILE *out, uint32_t omp_requires)
   fprintf (out, "\"\n\t\".file 1 \\\"<dummy>\\\"\"\n");
 
+  fprintf (out, "\n\t\".func __dummy$func ( );\"\n");
+  fprintf (out, "\t\".func __dummy$func ( )\"\n");
+  fprintf (out, "\t\"{\"\n");
+  fprintf (out, "\t\"}\"\n");
+
   size_t fidx = 0;
   for (id = func_ids; id; id = id->next)
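
For illustration only, the conditional variant suggested above might look roughly like the sketch below. It is not part of the patch. The helper names (needs_dummy_func, emit_dummy_func, maybe_emit_dummy_func) and the integer parameters ptx_major and sm_major are hypothetical. It assumes the PTX ISA major version and the sm_XX major number have already been parsed into plain integers, since the comment does not show how version2 and sm_ver2 are defined.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper: decide whether the empty dummy function is needed.
// Assumes already-parsed integers, e.g. ptx_major == 7 for PTX ISA 7.0 and
// sm_major == 8 for sm_80; this mirrors the condition quoted in the comment.
static bool
needs_dummy_func (int ptx_major, int sm_major)
{
  return ptx_major < 7 || sm_major < 8;
}

// Write the same four string fragments that the patch adds to the output.
static void
emit_dummy_func (FILE *out)
{
  assert (out != nullptr);
  fprintf (out, "\n\t\".func __dummy$func ( );\"\n");
  fprintf (out, "\t\".func __dummy$func ( )\"\n");
  fprintf (out, "\t\"{\"\n");
  fprintf (out, "\t\"}\"\n");
}

// Emit the workaround only for older PTX ISA versions or older sm targets.
static void
maybe_emit_dummy_func (FILE *out, int ptx_major, int sm_major)
{
  if (needs_dummy_func (ptx_major, sm_major))
    emit_dummy_func (out);
}
```

With these assumptions, a PTX 7.x module targeting sm_80 or newer would skip the dummy function, while any older PTX version or older target would still get it.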