https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81352
--- Comment #1 from Tom de Vries <vries at gcc dot gnu.org> ---
Confirmed.
This program (minimized from nested-function-1.f90) hangs at -O1 (with driver
384.111 on a Quadro M1200, for GOMP_NVPTX_JIT=-O[0-4]):
...
program foo
  integer :: a(3,3), l, ll
  a = 0
  !$acc parallel num_gangs (1) num_workers(1)
  do l=1,3
     !$acc loop vector
     do ll=1,3
        a(l,ll) = 2
     enddo
  enddo
  if (any(a(1:3,1:3).ne.2)) call abort
  !$acc end parallel
end program foo
...
The generated ptx for the abort is:
...
@ %r79 bra $L18;
{
call _gfortran_abort;
trap;
exit;
}
$L18:
...
The corresponding SASS code (at GOMP_NVPTX_JIT=-O4) is:
...
/*05d8*/ @P0 BRA `(.L_18);
/*05e8*/ JCAL `(_gfortran_abort);
/*05f0*/ BPT.TRAP 0x1;
/*05f8*/ EXIT;
.L_18:
...
In other words, there is no convergence point for the diverging branch, and
the fallthrough executes random code.
When moving the exit to after L19, we get instead:
...
/*0678*/ @P0 EXIT;
/*0688*/ JCAL `(_gfortran_abort);
/*0690*/ BPT.TRAP 0x1;
/*0698*/ EXIT;
...
No convergence point, but both paths lead to exit.
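For illustration, a sketch of what this first modification amounts to at the
ptx level. It is reconstructed from the description, not taken from actual
compiler output, and it assumes that the label meant is the branch target
shown as $L18 in the ptx snippet above:
...
// sketch only; label name $L18 is an assumption
@ %r79 bra $L18;
{
call _gfortran_abort;
trap;
}
$L18:
// exit moved here, so it is reached from both paths
exit;
...
Every thread reaches an exit, whether it takes the branch or falls through,
which is consistent with the predicated EXIT in the SASS above.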
And when moving both trap and exit to after L19, we get the convergence point:
...
/*0678*/ SSY `(.L_30);
/*0688*/ @P0 SYNC (*"TARGET= .L_30 "*);
/*0690*/ JCAL `(_gfortran_abort);
/*0698*/ SYNC (*"TARGET= .L_30 "*);
.L_30:
/*06a8*/ BPT.TRAP 0x1;
/*06b0*/ EXIT;
...
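As a sketch, this second modification would correspond to ptx along these
lines (reconstructed from the description rather than from actual compiler
output, with the label name $L18 assumed from the original ptx snippet):
...
// sketch only; label name $L18 is an assumption
@ %r79 bra $L18;
{
call _gfortran_abort;
}
$L18:
// trap and exit now follow the branch target
trap;
exit;
...
Both paths rejoin at the label before the trap and exit, which is presumably
why the JIT now emits the SSY/SYNC pair targeting .L_30.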
With either modification, the program no longer hangs.