https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99555
Tom de Vries <vries at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amonakov at gcc dot gnu.org

--- Comment #6 from Tom de Vries <vries at gcc dot gnu.org> ---
Current theory ...

All omp-threads are supposed to participate in a team barrier, and then all
move on together.  The master omp-thread participates from gomp_team_end, the
other omp-threads from the worker loop in gomp_thread_start.

Instead, it seems the master omp-thread gets stuck at the team barrier, while
all other omp-threads move on to the thread pool barrier, and that state
corresponds to the observed hang.

AFAICT, the problem starts when gomp_team_barrier_wake is called with
count == 1:
...
void
gomp_team_barrier_wake (gomp_barrier_t *bar, int count)
{
  if (bar->total > 1)
    asm ("bar.sync 1, %0;" : : "r" (32 * bar->total));
}
...
The count argument is ignored, and instead all omp-threads are woken up, which
causes omp-threads to escape the team barrier.

This is all a result of the gomp_barrier_handle_tasks path being taken in
gomp_team_barrier_wait_end, and I haven't figured out why that path is
triggered, so the root cause may still lie elsewhere.

Anyway, the nvptx bar.{c,h} is copied from linux/bar.{c,h}, which is
implemented using futex, with the futex uses replaced by bar.sync uses.

FWIW, replacing libgomp/config/nvptx/bar.{c,h} with
libgomp/config/posix/bar.{c,h} fixes the problem.  Did a full libgomp test
run, all problems fixed.
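
To make the suspected difference concrete, here is a toy model of the two wake-up behaviours. It is only a sketch and not libgomp code: the toy_* names and the struct are hypothetical, no real threads or bar.sync are involved, and the convention that count == 0 means "wake everyone" is an assumption modelled on a futex-style wake. It also does not model waiters re-checking the barrier state after being woken.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, NOT libgomp code: each slot records whether a hypothetical
   omp-thread is still blocked at the team barrier.  */
#define TOY_MAX_THREADS 8

struct toy_barrier
{
  int total;                      /* Number of participating threads.  */
  bool waiting[TOY_MAX_THREADS];  /* True while thread i is blocked.  */
};

/* Start with all TOTAL threads blocked at the barrier.  */
void
toy_barrier_init (struct toy_barrier *bar, int total)
{
  assert (total >= 1 && total <= TOY_MAX_THREADS);
  bar->total = total;
  for (int i = 0; i < total; i++)
    bar->waiting[i] = true;
}

/* Return how many threads are still blocked.  */
int
toy_count_waiting (const struct toy_barrier *bar)
{
  int n = 0;
  for (int i = 0; i < bar->total; i++)
    if (bar->waiting[i])
      n++;
  return n;
}

/* Futex-like wake (assumed semantics): release at most COUNT waiters,
   where COUNT == 0 means "release everyone".  Returns the number of
   waiters released.  */
int
toy_wake_counted (struct toy_barrier *bar, int count)
{
  assert (count >= 0);
  int limit = count == 0 ? bar->total : count;
  int released = 0;
  for (int i = 0; i < bar->total && released < limit; i++)
    if (bar->waiting[i])
      {
        bar->waiting[i] = false;
        released++;
      }
  return released;
}

/* bar.sync-like wake as described in the comment above: COUNT is ignored
   and every waiter is released.  Returns the number released.  */
int
toy_wake_all (struct toy_barrier *bar, int count)
{
  (void) count;  /* Deliberately unused, mirroring the suspected problem.  */
  int released = 0;
  for (int i = 0; i < bar->total; i++)
    if (bar->waiting[i])
      {
        bar->waiting[i] = false;
        released++;
      }
  return released;
}
```

With four blocked threads and count == 1, toy_wake_counted releases one thread and leaves three at the barrier, whereas toy_wake_all releases all four. In this model, that is the sense in which omp-threads "escape" the barrier.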