https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102424

            Bug ID: 102424
           Summary: OpenACC 'reduction' with outer 'loop seq', inner 'loop
                    gang'
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Keywords: openacc
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tschwinge at gcc dot gnu.org
                CC: frederik at gcc dot gnu.org
  Target Milestone: ---

Working on OpenACC 'kernels', Frederik noticed that
'libgomp.oacc-fortran/kernels-acc-loop-reduction-2.f90'
(<https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libgomp/testsuite/libgomp.oacc-fortran/kernels-acc-loop-reduction-2.f90;hb=ffbdd78a4a84d80a5303d4f7a20553cf96954db9>)
misbehaves if put into OpenACC 'parallel' form as follows:

    @@ -16,13 +17,13 @@ subroutine bar(vol)
       INTEGER :: vol
       INTEGER :: j,k

    -  !$ACC KERNELS
    -  !$ACC LOOP REDUCTION(+:vol)
    +  !$ACC PARALLEL
    +  !$ACC LOOP SEQ REDUCTION(+:vol)
       DO k=1,2
    -     !$ACC LOOP REDUCTION(+:vol)
    +     !$ACC LOOP GANG VECTOR REDUCTION(+:vol)
          DO j=1,2
            vol = vol + 1
          ENDDO
       ENDDO
    -  !$ACC END KERNELS
    +  !$ACC END PARALLEL
     end subroutine bar

(Unusual here is the outer 'loop' with 'seq' clause.)
GCC accepts this without diagnostic -- but produces unexpected (wrong?) results
at runtime! (Though, not 100 %...)

It seems that generally this can be cured by avoiding gang parallelism in the
inner loop.
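As a minimal sketch of that workaround, the 'parallel' form from the diff above
could be written with vector-only parallelism on the inner loop. This assumes
that "avoiding gang parallelism" means dropping 'gang' while keeping 'vector';
the subroutine name 'bar_no_gang' is hypothetical and not part of the test
case.

```fortran
! Hypothetical variant of 'bar' from the test case: the inner loop uses
! vector parallelism only, without 'gang'.
subroutine bar_no_gang(vol)
  implicit none
  integer :: vol
  integer :: j, k

  !$acc parallel
  !$acc loop seq reduction(+:vol)
  do k = 1, 2
     ! No 'gang' clause here (assumption: 'vector' alone is retained).
     !$acc loop vector reduction(+:vol)
     do j = 1, 2
        vol = vol + 1
     end do
  end do
  !$acc end parallel
end subroutine bar_no_gang
```

The intended result is that 'vol' grows by 4 (2 x 2 iterations), which is what a
sequential execution of the loop nest produces.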

The problem can also be cured by putting an explicit 'reduction(+:vol)' clause
onto the compute construct itself (instead of implicit 'copy(vol)' clause per
current GCC implementation) -- and I can see how that triggers different
("proper") handling of 'var' as a reduction variable at the top-level in the
compute region.
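A minimal sketch of this second workaround, keeping 'gang vector' on the inner
loop and adding the 'reduction' clause to the 'parallel' construct. The
subroutine name 'bar_explicit_reduction' is hypothetical; the loop nest is the
one from the diff above.

```fortran
! Hypothetical variant of 'bar' from the test case: the compute construct
! itself carries an explicit reduction clause for 'vol'.
subroutine bar_explicit_reduction(vol)
  implicit none
  integer :: vol
  integer :: j, k

  ! Explicit 'reduction(+:vol)' on 'parallel', instead of relying on the
  ! implicit data clause for 'vol'.
  !$acc parallel reduction(+:vol)
  !$acc loop seq reduction(+:vol)
  do k = 1, 2
     !$acc loop gang vector reduction(+:vol)
     do j = 1, 2
        vol = vol + 1
     end do
  end do
  !$acc end parallel
end subroutine bar_explicit_reduction
```

As before, the intended result is that 'vol' grows by 4 across the 2 x 2
iterations.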

In <https://github.com/OpenACC/openacc-spec/issues/410> (only visible to
members of the GitHub OpenACC organization) I'm discussing whether this is a
quality of implementation issue or a specification issue.
