https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121664
Bug ID: 121664
Summary: [Nvptx][OpenMP] 'omp target ... simd' with 'collapse'
– leads to illegal memory access
Product: gcc
Version: 16.0
Status: UNCONFIRMED
Keywords: openmp, wrong-code
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: burnus at gcc dot gnu.org
CC: tschwinge at gcc dot gnu.org
Depends on: 121453
Target Milestone: ---
Created attachment 62196
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=62196&action=edit
test.f90 — compile for Nvptx offloading with "gfortran -fopenmp -O2
-fno-tree-loop-vectorize"
+++ This bug was initially created as a clone of Bug #121453 +++
This shows up with the SPECaccel v2023 testcase '455.seismic', which fails
with nvptx offloading as:
libgomp: cuCtxSynchronize error: an illegal memory access was encountered
or in the debugger:
CUDA Exception: Warp Out-of-range Address
For the big program, it occurs with 'src.alt/omp_target' for the first loop:
!$omp target teams distribute parallel do simd collapse(3)
with -O2 or -O3. Once the testcase is reduced, it starts to pass at those
levels, but compiling with
-O2 -fno-tree-loop-vectorize
still makes the attached simplified testcase fail.
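For orientation, the general shape of such a loop nest can be sketched as below. This is a hypothetical illustration only, NOT the attached test.f90: the program name, the array 'a', the bounds and the loop body are all assumptions, and it has not been checked whether this sketch triggers the failure. It only shows the combined construct with collapse(3) plus a host-side check of the result.

```fortran
! Hypothetical sketch, NOT the attached test.f90: the array name, the
! bounds and the loop body are illustrative assumptions.
program collapse3_sketch
  implicit none
  integer, parameter :: nx = 16, ny = 8, nz = 4
  real :: a(nx, ny, nz)
  integer :: i, j, k

  a = 0.0

  ! Same combined construct as the first loop mentioned above.
  !$omp target teams distribute parallel do simd collapse(3) map(tofrom: a)
  do k = 1, nz
    do j = 1, ny
      do i = 1, nx
        ! Store the linear index of each element (exactly representable).
        a(i, j, k) = real(i + nx * ((j - 1) + ny * (k - 1)))
      end do
    end do
  end do
  !$omp end target teams distribute parallel do simd

  ! Verify on the host that every element holds its linear index.
  do k = 1, nz
    do j = 1, ny
      do i = 1, nx
        if (a(i, j, k) /= real(i + nx * ((j - 1) + ny * (k - 1)))) error stop "mismatch"
      end do
    end do
  end do
  print *, "ok"
end program collapse3_sketch
```

Like the attachment, a file of this kind would be built with "gfortran -fopenmp -O2 -fno-tree-loop-vectorize" using a compiler configured for nvptx offloading.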
NOTE: It works on the host and with AMD GPU (gfx90a) offloading, while it
fails on both sm_70 and sm_86 Nvidia GPUs.
Passing GOMP_NVPTX_JIT=-O0 or GOMP_NVPTX_JIT=-O2 does not change the result.
It fails on every run, with both 12.2 and 13.0, and does not seem to be a
regression.
Referenced Bugs:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121453
[Bug 121453] [OpenMP] 'omp simd' with 'collapse' – variable '.count'
uninitialized, but used as 'if (.iter.14 == .count.15)'