https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96428
Thomas Schwinge <tschwinge at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|RESOLVED                    |REOPENED
         Resolution|FIXED                       |---
                 CC|                            |tschwinge at gcc dot gnu.org
   Last reconfirmed|                            |2020-10-01

--- Comment #7 from Thomas Schwinge <tschwinge at gcc dot gnu.org> ---
First: Tobias, Tom, thanks for fixing this issue!

(In reply to Tobias Burnus from comment #3)
> Created attachment 48988 [details]
> Test case (as diff – two files)

These attachment 48988 testcases got included in commit
344f09a756ebd50510cc1eb3db111fd61c527702.

I don't understand 'libgomp.oacc-fortran/pr96628-part1.f90':

    module m2
      real*8 :: mysum
      !$acc declare device_resident(mysum)

So 'mysum' lives in device-global memory.

    contains
      SUBROUTINE one(t)
        !$acc routine
        REAL*8, INTENT(IN) :: t(:)
        mysum = sum(t)
      END SUBROUTINE one

This now writes into device-global 'mysum', potentially from several
gang/worker/vector threads in parallel, race condition?

      SUBROUTINE two(t)
        !$acc routine seq
        REAL*8, INTENT(INOUT) :: t(:)
        t = (100.0_8*t)/sum
      END SUBROUTINE two
    end module m2

    source-gcc/libgomp/testsuite/libgomp.oacc-fortran/pr96628-part1.f90: In function ‘__m2_MOD_two’:
    source-gcc/libgomp/testsuite/libgomp.oacc-fortran/pr96628-part1.f90:18: warning: ‘sum’ is used uninitialized [-Wuninitialized]
       18 |     t = (100.0_8*t)/sum

So, is this really testing what it means to be testing?

Should the testcase get some 'target openacc_nvidia_accel_selected'
'scan-offload-rtl-dump' added to make sure that we're actually generating the
expected PTX instructions?

Also, the testcase files should be renamed 'libgomp.oacc-fortran/pr96428-*' to
match the PR ID.

(In reply to Tom de Vries from comment #4)
> FTR, this is not the leanest solution.

> followup patch: [...]

> we have instead: [simpler]

Any plans to apply that as a follow-up?