Hi! On 2021-05-21T16:28:57+0200, I wrote: > This came into existance internally, when the og10 branch was set up. > > On 2020-06-03T17:23:51+0200, Tobias Burnus <tobias_bur...@mentor.com> wrote: >> This fixes [...] on OG10 (og10_prerelease); it will be >> later applied to gcn/… to fix the issue. (Upstream is unaffected.) >> [...] > > However, that means that your testcase does work on master branch (and > would regress if certain commits got pushed there). As the testcase has > got a property useful for a thing I'm currently working on, I've pushed > to master branch "Add 'libgomp.oacc-fortran/privatized-ref-2.f90'" in > commit 61796dc03befa9b7426d5bc7c336cca585944143
After commit a78b1ab1df9ca44acc5638e8f9d0ae2e62bd65ed "amdgcn: Tune default OpenMP/OpenACC GPU utilization", we'd seen this test case regress (only) on our AMD GPU amd-instinct1/'-march=gfx908' system: {+WARNING: program timed out.+} [-PASS:-]{+FAIL:+} libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa -O0 execution test Same for other optimization levels. Nothing more in 'libgomp.log'. I have determined this is a latent problem in the original test case, which contains a few instances of code as follows: !$acc parallel copyout(array) array = [(-i, i = 1, nn)] !$acc loop gang private(array) do i = 1, 10 array(i) = i end do if (any (array /= [(-i, i = 1, nn)])) error stop 1 !$acc end parallel Given the '!$acc loop gang', the whole containing '!$acc parallel' region is launched with gang parallelism. The '!$acc loop gang' executes in gang-partitioned mode, but the 'array' assignment before and checks after don't execute in a (hypothetical) gang-single mode, but instead in gang-redundant mode, meaning that each gang executes these concurrently, giving rise to data races and other mischief. Thus, we have to make sure that we're not executing non-parallelized code in gang-redundant mode, by putting these parts into their own 'parallel' constructs, which then default to 'num_gangs(1)'. Pushed to master branch commit f8187b5c0d22723c8e0a3d13d0ea5dd7ecfeff75 "Fix OpenACC gang-redundant execution in 'libgomp.oacc-fortran/privatized-ref-2.f90'", see attached. Grüße Thomas > I confirm that "FIXME: Fails due to PR middle-end/95499" is still a > problem. > > And, GCC '-O' reports: > > [...]/libgomp.oacc-fortran/privatized-ref-2.f90:147:21: > > 147 | subroutine foobar15 (scalar) > | ^ > Warning: ‘foobar15’ defined but not used [-Wunused-function] > [...]/libgomp.oacc-fortran/privatized-ref-2.f90: In function ‘MAIN__’: > [...]/libgomp.oacc-fortran/privatized-ref-2.f90:31:22: warning: > ‘a.offset’ is used uninitialized [-Wuninitialized] > 31 | A = [(3*j, j=1, 10)] > | ^ > [...]/libgomp.oacc-fortran/privatized-ref-2.f90:27:30: note: ‘a’ declared > here > 27 | integer, allocatable :: A(:) > | ^ > [...]/libgomp.oacc-fortran/privatized-ref-2.f90:31:22: warning: > ‘a.dim[0].lbound’ is used uninitialized [-Wuninitialized] > 31 | A = [(3*j, j=1, 10)] > | ^ > [...]/libgomp.oacc-fortran/privatized-ref-2.f90:27:30: note: ‘a’ declared > here > 27 | integer, allocatable :: A(:) > | ^ > [...]/libgomp.oacc-fortran/privatized-ref-2.f90:31:22: warning: > ‘a.dim[0].ubound’ is used uninitialized [-Wuninitialized] > 31 | A = [(3*j, j=1, 10)] > | ^ > [...]/libgomp.oacc-fortran/privatized-ref-2.f90:27:30: note: ‘a’ declared > here > 27 | integer, allocatable :: A(:) > | ^ > > I haven't looked into these. > > > Grüße > Thomas ----------------- Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
>From f8187b5c0d22723c8e0a3d13d0ea5dd7ecfeff75 Mon Sep 17 00:00:00 2001 From: Thomas Schwinge <tho...@codesourcery.com> Date: Fri, 21 Jan 2022 14:58:23 +0100 Subject: [PATCH] Fix OpenACC gang-redundant execution in 'libgomp.oacc-fortran/privatized-ref-2.f90' This was a latent problem, and this commit here now resolves a regression that after recent commit a78b1ab1df9ca44acc5638e8f9d0ae2e62bd65ed "amdgcn: Tune default OpenMP/OpenACC GPU utilization" we had (only) seen on a GCN offloading '-march=gfx908' system: {+WARNING: program timed out.+} [-PASS:-]{+FAIL:+} libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa -O0 execution test Same for other optimization levels. Make sure that we're not executing non-parallelized code in gang-redundant mode, by putting these parts into their own 'parallel' constructs, which then default to 'num_gangs(1)'. libgomp/ * testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: Fix OpenACC gang-redundant execution. --- .../libgomp.oacc-fortran/privatized-ref-2.f90 | 42 ++++++++++++++----- 1 file changed, 32 insertions(+), 10 deletions(-) diff --git a/libgomp/testsuite/libgomp.oacc-fortran/privatized-ref-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/privatized-ref-2.f90 index f4a6af986e8..6bd17148911 100644 --- a/libgomp/testsuite/libgomp.oacc-fortran/privatized-ref-2.f90 +++ b/libgomp/testsuite/libgomp.oacc-fortran/privatized-ref-2.f90 @@ -53,12 +53,10 @@ contains integer :: array(nn) !$acc parallel copyout(array) ! { dg-line l_compute[incr c_compute] } - ! { dg-note {variable 'atmp\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute } - ! { dg-note {variable 'shadow_loopvar\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute } - ! { dg-note {variable 'offset\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute } ! { dg-note {variable 'S\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute } - ! { dg-note {variable 'test\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute } array = [(-i, i = 1, nn)] + !$acc end parallel + !$acc parallel copy(array) !$acc loop gang private(array) ! { dg-line l_loop[incr c_loop] } ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } ! { dg-note {variable 'array' in 'private' clause potentially has improper OpenACC privatization level: 'parm_decl'} "" { target *-*-* } l_loop$c_loop } @@ -66,6 +64,13 @@ contains do i = 1, 10 array(i) = i end do + !$acc end parallel + !$acc parallel copyin(array) ! { dg-line l_compute[incr c_compute] } + ! { dg-note {variable 'test\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute } + ! { dg-note {variable 'atmp\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute } + ! { dg-note {variable 'shadow_loopvar\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute } + ! { dg-note {variable 'offset\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute } + ! { dg-note {variable 'S\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute } if (any (array /= [(-i, i = 1, nn)])) error stop 1 !$acc end parallel end subroutine foo @@ -74,14 +79,10 @@ contains integer :: array(:) !$acc parallel copyout(array) ! { dg-line l_compute[incr c_compute] } - ! { dg-note {variable 'atmp\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute } - ! { dg-note {variable 'shadow_loopvar\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute } - ! { dg-note {variable 'offset\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute } ! { dg-note {variable 'S\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute } - ! { dg-note {variable 'test\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute } - ! { dg-note {variable 'parm\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute } - ! { dg-note {variable 'A\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: static} "" { target *-*-* } l_compute$c_compute } array = [(-2*i, i = 1, size(array))] + !$acc end parallel + !$acc parallel copy(array) !$acc loop gang private(array) ! { dg-line l_loop[incr c_loop] } ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } ! { dg-note {variable 'array\.[0-9]+' in 'private' clause is candidate for adjusting OpenACC privatization level} "" { target *-*-* } l_loop$c_loop } @@ -91,6 +92,11 @@ contains do i = 1, 10 array(i) = 9*i end do + !$acc end parallel + !$acc parallel copyin(array) ! { dg-line l_compute[incr c_compute] } + ! { dg-note {variable 'test\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute } + ! { dg-note {variable 'A\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: static} "" { target *-*-* } l_compute$c_compute } + ! { dg-note {variable 'S\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute } if (any (array /= [(-2*i, i = 1, 10)])) error stop 2 !$acc end parallel end subroutine bar @@ -100,6 +106,8 @@ contains !$acc parallel copyout(str) str = "abcdefghij" + !$acc end parallel + !$acc parallel copy(str) !$acc loop gang private(str) ! { dg-line l_loop[incr c_loop] } ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } ! { dg-note {variable 'str' in 'private' clause potentially has improper OpenACC privatization level: 'parm_decl'} "" { target *-*-* } l_loop$c_loop } @@ -110,6 +118,8 @@ contains do i = 1, 10 str(i:i) = achar(ichar('A') + i) end do + !$acc end parallel + !$acc parallel copyin(str) if (str /= "abcdefghij") error stop 3 !$acc end parallel end @@ -122,10 +132,14 @@ contains ! *************************************** !!$acc parallel copyout(str) str = "abcdefghij" + !!$acc end parallel + !!$acc parallel copy(str) !!$acc loop gang private(str) !do i = 1, 10 ! str(i:i) = achar(ichar('A') + i) !end do + !!$acc end parallel + !!$acc parallel copyin(str) if (str /= "abcdefghij") error stop 5 !!$acc end parallel end @@ -135,6 +149,8 @@ contains !$acc parallel copyout(scalar) scalar = "abcdefghi-12345" + !$acc end parallel + !$acc parallel copy(scalar) !$acc loop gang private(scalar) ! { dg-line l_loop[incr c_loop] } ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop } ! { dg-note {variable 'scalar' in 'private' clause potentially has improper OpenACC privatization level: 'parm_decl'} "" { target *-*-* } l_loop$c_loop } @@ -145,7 +161,9 @@ contains scalar(i:i) = achar(ichar('A') + i) end do !$acc end parallel + !$acc parallel copyin(scalar) if (scalar /= "abcdefghi-12345") error stop 6 + !$acc end parallel end subroutine foobar subroutine foobar15 (scalar) integer :: i @@ -153,11 +171,15 @@ contains !$acc parallel copyout(scalar) scalar = "abcdefghi-12345" + !$acc end parallel + !$acc parallel copy(scalar) !$acc loop gang private(scalar) do i = 1, 15 scalar(i:i) = achar(ichar('A') + i) end do !$acc end parallel + !$acc parallel copyin(scalar) if (scalar /= "abcdefghi-12345") error stop 1 + !$acc end parallel end subroutine foobar15 end -- 2.34.1