https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81802

            Bug ID: 81802
           Summary: Report cuLaunchKernel launch dimensions in
                    GOMP_OFFLOAD_run
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libgomp
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vries at gcc dot gnu.org
                CC: jakub at gcc dot gnu.org
  Target Milestone: ---

This patch:
...
diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index f5b9502..2cb63b4 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -2457,6 +2457,7 @@ nvptx_stacks_free (void *p, int num)
 void
 GOMP_OFFLOAD_run (int ord, void *tgt_fn, void *tgt_vars, void **args)
 {
+  const struct targ_fn_launch *launch = ((struct targ_fn_descriptor *) tgt_fn)->launch;
   CUfunction function = ((struct targ_fn_descriptor *) tgt_fn)->fn;
   CUresult r;
   struct ptx_device *ptx_dev = ptx_devices[ord];
@@ -2492,6 +2493,9 @@ GOMP_OFFLOAD_run (int ord, void *tgt_fn, void *tgt_vars, void **args)
     CU_LAUNCH_PARAM_BUFFER_SIZE, &fn_args_size,
     CU_LAUNCH_PARAM_END
   };
+  GOMP_PLUGIN_debug (0, "  %s: kernel %s: launch"
+                    " [(teams: %u), 1, 1] [32, (threads: %u), 1]\n",
+                    __FUNCTION__, launch->fn, teams, threads);
   r = CUDA_CALL_NOCHECK (cuLaunchKernel, function, teams, 1, 1,
                         32, threads, 1, 0, ptx_dev->null_stream->stream,
                         NULL, config);
...

prints this information when run with GOMP_DEBUG=1:
...
  GOMP_OFFLOAD_run: kernel f2_tpf_static32$_omp_fn$0: launch [(teams: 1), 1, 1] [32, (threads: 8), 1]
...

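I'm not happy with the term 'threads'; it is ambiguous in this context. The
value is passed to cuLaunchKernel as blockDimY next to a blockDimX of 32, so
each such 'thread' corresponds to 32 CUDA threads.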

Perhaps 'warps', or 'omp-threads' would be better.
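
As a minimal sketch of what the 'warps' variant could look like, the standalone
C fragment below formats the same launch geometry with the alternative label
and computes the CUDA threads per block implied by it. The names WARP_SIZE,
cuda_threads_per_block and format_launch are hypothetical and exist only for
this illustration; they are not part of libgomp or the patch above.

```c
#include <stdio.h>

/* Assumption for illustration: blockDimX is the constant 32 used in the
   cuLaunchKernel call in the patch.  */
enum { WARP_SIZE = 32 };

/* Hypothetical helper: number of CUDA threads in one block when blockDimY
   counts warps and blockDimZ is 1.  */
static unsigned
cuda_threads_per_block (unsigned warps)
{
  return WARP_SIZE * warps;
}

/* Hypothetical helper: format the launch message with 'warps' in place of
   'threads'.  Returns the value returned by snprintf.  */
static int
format_launch (char *buf, size_t len, const char *kernel,
               unsigned teams, unsigned warps)
{
  return snprintf (buf, len,
                   "kernel %s: launch [(teams: %u), 1, 1]"
                   " [%d, (warps: %u), 1]",
                   kernel, teams, (int) WARP_SIZE, warps);
}
```

With the values from the debug output above (teams 1, blockDimY 8),
cuda_threads_per_block (8) gives 256, which is the number the word 'threads'
could be misread as describing.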
