This patch fixes a race condition in
libgomp.oacc-c-c++-common/data-2-lib.c, an OpenACC test that exercises
the runtime wait API used in conjunction with asynchronous OpenACC
offloaded regions. The test was missing a call to acc_wait (1) before
copying data back from the device. I'm not sure why this problem went
undetected for so long. Either the parallel region runs fast enough on
the GPU that the copied-out data happens to be correct, or Nvidia's CUDA
runtime blocks all device->host data transfers until the GPU is no
longer processing the data. I suspect it's the former.
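
For anyone unfamiliar with the pattern, here is a minimal sketch of it. This is not the test itself: double_async is a hypothetical helper made up for illustration, and the queue number 1 is chosen only to mirror the acc_wait (1) call in the patch. The idea is that work queued with async (1) may still be in flight when the construct returns, so the host has to wait on that queue before it reads the results.

```c
#include <assert.h>
#ifdef _OPENACC
#include <openacc.h>	/* acc_wait */
#endif

/* Hypothetical helper, not taken from data-2-lib.c: store 2 * a[i]
   into b[i] using OpenACC async queue 1.  */
void
double_async (const float *a, float *b, int n)
{
  int i;

  assert (n >= 0);

  /* With async (1), the kernel and the copyout of b are only queued;
     control may return to the host before either has completed.  */
#pragma acc parallel loop async (1) copyin (a[0:n]) copyout (b[0:n])
  for (i = 0; i < n; i++)
    b[i] = a[i] * 2.0f;

#ifdef _OPENACC
  /* Block until everything queued on async queue 1 has finished.
     Without this, the caller could read b too early.  */
  acc_wait (1);
#endif
}
```

Built with -fopenacc, the loop is offloaded and the acc_wait (1) call is what makes b safe to read once the function returns. Built without it, the pragma is ignored and the loop simply runs on the host, so the sketch still compiles and produces the same values.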

I've applied this patch to trunk and og7 as obvious.

Cesar
2017-12-01  Cesar Philippidis  <ce...@codesourcery.com>

	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/data-2-lib.c: Add missing
	call to acc_wait (1).


diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/data-2-lib.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/data-2-lib.c
index 1694f582363..f553d3d839c 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/data-2-lib.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/data-2-lib.c
@@ -64,6 +64,8 @@ main (int argc, char **argv)
   for (i = 0; i < N; i++)
     b[i] = a[i];
 
+  acc_wait (1);
+
   acc_memcpy_from_device (a, d_a, nbytes);
   acc_memcpy_from_device (b, d_b, nbytes);
   
