https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88756
Bug ID: 88756
Summary: [nvptx, openacc] Override too many num_workers in nvptx plugin, instead of erroring out
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: vries at gcc dot gnu.org
Target Milestone: ---

Consider this minimized/modified test-case:
...
$ cat libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-1.c
/* { dg-do run } */

#include <stdlib.h>

const int nw = 64;

int
main (void)
{
  const int n = 10;
  int i;
  int array[n];

  for (i = 0; i < n; i++)
    array[i] = i + 1;

  {
    int res, vres;

    res = 0;
#pragma acc parallel num_workers (nw) copy (res)
#pragma acc loop worker reduction (+:res)
    for (i = 0; i < n; i++)
      res = res + array[i];

    vres = 0;
    for (i = 0; i < n; i++)
      vres = vres + array[i];

    if (res != vres)
      abort ();
  }

  return 0;
}
...

When compiling with c.exp, we have:
...
PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/reduction-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -O0 (test for excess errors)
FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/reduction-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -O0 execution test
...
while with c++.exp, we have:
...
FAIL: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/reduction-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -O0 (test for excess errors)
PASS: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/reduction-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -O0 execution test
...

Looking first at c++, the c++ front-end delivers a hard-coded constant for num_workers:
...
$ grep parallel reduction-1.c.004t.original
#pragma acc parallel map(tofrom:res) num_workers(64)
...
and the num_workers constant is then overridden in the compiler to 32:
...
In function 'main._omp_fn.0':
libgomp.oacc-c-c++-common/reduction-1.c:21:13: warning: using num_workers (32), ignoring 64
$ grep FUNC_MAP reduction-1.s
//:FUNC_MAP "main$_omp_fn$0", 0x1, 0x20, 0x20
...

OTOH, the c front-end delivers a variable for num_workers (only at -O0, otherwise it'll do the same as the c++ front-end):
...
$ grep parallel reduction-1.c.004t.original
#pragma acc parallel map(tofrom:res) num_workers(nw)
...
which cannot be overridden in the compiler (meaning, in nvptx_goacc_validate_dims), given that its value isn't known:
...
$ grep FUNC_MAP reduction-1.s
//:FUNC_MAP "main$_omp_fn$0", 0x1, 0, 0x20
...
and at runtime we run into a GOMP_PLUGIN_fatal in the libgomp nvptx plugin:
...
libgomp: The Nvidia accelerator has insufficient resources to launch 'main$_omp_fn$0' with num_workers = 64; recompile the program with 'num_workers = 32' on that offloaded region or '-fopenacc-dim=:32'
...

For the user, it's somewhat confusing that this passes with a warning when compiling as C++, and fails to execute when compiling as C.

The difference originates in the front-ends, but that doesn't seem to be openacc-specific, so while it looks possible to fix in the C front-end (basically, make c_fully_fold_internal apply to launch dims even for !optimize), I'm not sure that's a good and acceptable idea.

[ And, given this difference, it's probably good to test this behaviour in a dedicated test-case, but otherwise avoid const int for dimension settings in libgomp.oacc-c-c++-common test-cases, which are tested for both C and C++. At first glance, this would mean fixing libgomp.oacc-c-c++-common/reduction-[1-5].c . ]

OTOH, we can also look at the consequences of the front-end difference, which are either:
- an override in the compiler, or
- a failure at runtime.

[ In other words, we can abstract away from the const int handling, and conclude that we see the same difference for:
...
int nw = 64;
#pragma acc parallel num_workers (nw) copy (res)
...
and
...
#pragma acc parallel num_workers (64) copy (res)
...
]

While it's clear that in the compiler (that is, in nvptx_goacc_validate_dims) we can't do better, I wonder why we don't do the same in the plugin, that is, override with warning. We would have the more acceptable difference of "compile with warning and run" vs "compile and run with warning".
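
To make the proposal concrete, here is a minimal sketch of what "override with warning" could look like at launch time. This is not the actual plugin code: the helper name clamp_num_workers, its parameters, and the use of plain fprintf are all hypothetical, and the message wording just mirrors the compiler warning quoted earlier.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not part of libgomp: clamp a requested worker
   count to the maximum the device can launch.  Instead of aborting, it
   warns on stderr and returns the clamped value.  */
int
clamp_num_workers (int requested, int max_workers, const char *fn_name)
{
  assert (max_workers > 0);

  /* The request fits, nothing to override.  */
  if (requested <= max_workers)
    return requested;

  /* Too many workers requested: warn and fall back to the maximum.  */
  fprintf (stderr,
	   "libgomp: warning: using num_workers (%d) for '%s', ignoring %d\n",
	   max_workers, fn_name, requested);
  return max_workers;
}
```

With the test-case above, a call like clamp_num_workers (64, 32, "main$_omp_fn$0") would return 32 and emit the warning, so the C variant would run the same way the C++ variant does after the compile-time override.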