https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122314
Bug ID: 122314
Summary: An omp barrier can execute tasks scheduled after said
barrier
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: libgomp
Assignee: unassigned at gcc dot gnu.org
Reporter: matmal01 at gcc dot gnu.org
CC: jakub at gcc dot gnu.org
Target Milestone: ---
With the following testcase, the existence of the `omp barrier` at the start of
the parallel region introduces a race condition so the final total count in
`full_data` is not 10. (Tested on AArch64 linux -- believe it's not specific
to that arch).
I believe this to be due to the implementation of the barrier.
In config/linux/bar.c `gomp_team_barrier_wait_end` checks if `BAR_TASK_PENDING`
is set on the generation and performs tasks if so.
There is a timing window after having been woken from `do_wait` by a generation
increment and reading the generation.
In this window some other thread could have raced ahead and scheduled a task.
Then when the generation is read the `BAR_TASK_PENDING` flag is set -- and the
task performed is the one after this region.
I'm honestly not 100% sure that this is disallowed by the standard -- the only
language around it I could find is at the bottom of the "Task Scheduling" page
that sais "Task scheduling points dynamically divide task regions into parts.
Each part is executed uninterrupted from start to end"
https://www.openmp.org/spec-html/5.0/openmpsu51.html#x75-2330002.10.6
It doesn't seem to trigger with clang FWIW.
Raising the question to check whether it is indeed disallowed.
(Partly just for information, partly because I'm proposing patches to redesign
the current barrier and want to know whether this behaviour should be changed).
Running the below testcase like so to give crash (gcc-install a recently built
compiler from trunk).
```
vshcmd: > ../gcc-install/bin/gcc testcase.C -O3 -fopenmp -o temp.exe
> lego-c2-qs-78:temp [08:15:42] $
vshcmd: > while
vshcmd: > LD_LIBRARY_PATH="../gcc-install/lib64:$LD_LIBRARY_PATH" \
vshcmd: > OMP_NUM_THREADS=8 \
vshcmd: > ./temp.exe; do echo -n "."; done
> > > .. <snip> ...Aborted (core dumped)
lego-c2-qs-78:temp [08:15:46] $
```
```
#include <omp.h>
extern "C" void abort ();
#define NUM_THREADS 8
unsigned full_data[NUM_THREADS] = {0};
void
test ()
{
#pragma omp parallel num_threads(8)
{
#pragma omp barrier
/* Initialise (something to overwrite any tasks ran at earlier barrier).
This is here in order to get an observable condition from the behaviour
I spotted. */
full_data[omp_get_thread_num ()] = 0;
#pragma omp for
for (int i = 0; i < 10; i++)
#pragma omp task
{
full_data[omp_get_thread_num ()] += 1;
}
}
unsigned total = 0;
for (int i = 0; i < NUM_THREADS; i++)
total += full_data[i];
if (total != 10)
abort ();
}
int
main ()
{
test ();
}
```
N.b. if reproduction is difficult, can run in GDB to force the issue:
```
vshcmd: > LD_LIBRARY_PATH="../gcc-install/lib64:$LD_LIBRARY_PATH" \
vshcmd: > gdb -q ./temp.exe
vshcmd: > break gomp_team_barrier_wait if $_thread == 1
vshcmd: > y
vshcmd: > break gomp_team_barrier_wait_final
vshcmd: > y
vshcmd: > run
> Reading symbols from ./temp.exe...
(gdb) Function "gomp_team_barrier_wait" not defined.
Make breakpoint pending on future shared library load? (y or [n]) Breakpoint 1
(gomp_team_barrier_wait if $_thread == 1) pending.
(gdb) Function "gomp_team_barrier_wait_final" not defined.
Make breakpoint pending on future shared library load? (y or [n]) Breakpoint 2
(gomp_team_barrier_wait_final) pending.
(gdb) Starting program: /local/home/matmal01/temp/temp.exe
warning: File
"/local/home/matmal01/gcc-install/lib64/libstdc++.so.6.0.34-gdb.py"
auto-loading has been declined by your `auto-load safe-path' set to
"$debugdir:$datadir/auto-load".
To enable execution of this file add
add-auto-load-safe-path
/local/home/matmal01/gcc-install/lib64/libstdc++.so.6.0.34-gdb.py
line to your configuration file "/home/matmal01/.gdbinit".
To completely disable this security protection add
set auto-load safe-path /
line to your configuration file "/home/matmal01/.gdbinit".
For more information about this security protection see the
"Auto-loading safe path" section in the GDB manual. E.g., run from the shell:
info "(gdb)Auto-loading safe path"
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/aarch64-linux-gnu/libthread_db.so.1".
[New Thread 0xfffff79cf060 (LWP 3911701)]
[New Thread 0xfffff71bf060 (LWP 3911702)]
[New Thread 0xfffff69af060 (LWP 3911703)]
[New Thread 0xfffff619f060 (LWP 3911704)]
[New Thread 0xfffff598f060 (LWP 3911705)]
[New Thread 0xfffff517f060 (LWP 3911706)]
[New Thread 0xfffff496f060 (LWP 3911707)]
Thread 1 "temp.exe" hit Breakpoint 1, gomp_team_barrier_wait (bar=0x4324c0) at
/local/home/matmal01/gcc-source/libgomp/config/linux/bar.c:127
127 gomp_team_barrier_wait_end (bar, gomp_barrier_wait_start (bar));
(gdb)
vshcmd: > set scheduler-locking on
(gdb)
vshcmd: > define go-to-wait-end
vshcmd: > if $_thread != 1 &&
!$_any_caller_is("gomp_team_barrier_wait_end")
vshcmd: > tbreak gomp_team_barrier_wait_end
vshcmd: > continue
vshcmd: > end
vshcmd: > end
Type commands for definition of "go-to-wait-end".
End with a line saying just "end".
> > > >>(gdb)
vshcmd: > thread apply all go-to-wait-end
... <snip output>
vshcmd: > # At this point know all threads have indicated their arrival at the
vshcmd: > # barrier. Now thread 1 can go through this barrier and spawn some
vshcmd: > # tasks. Have already put a breakpoint on
vshcmd: > # gomp_team_barrier_wait_final so let's continue to there.
vshcmd: > thread 1
vshcmd: > cont
[Switching to thread 1 (Thread 0xfffff79f1e20 (LWP 3911699))]
#0 gomp_team_barrier_wait (bar=0x4324c0) at
/local/home/matmal01/gcc-source/libgomp/config/linux/bar.c:127
127 gomp_team_barrier_wait_end (bar, gomp_barrier_wait_start (bar));
(gdb) Continuing.
Thread 1 "temp.exe" hit Breakpoint 2, gomp_team_barrier_wait_final
(bar=0x4324c0) at
/local/home/matmal01/gcc-source/libgomp/config/linux/bar.c:133
133 gomp_barrier_state_t state = gomp_barrier_wait_final_start (bar);
(gdb)
vshcmd: > # Barrier generation has bit saying that there are tasks waiting.
vshcmd: > print *bar
$1 = {total = 8, generation = 9, awaited = 8, awaited_final = 8}
(gdb)
vshcmd: > # Now we have the problem state -- barrier generation has bit saying
vshcmd: > # that there are tasks waiting, all other tasks are just about to
vshcmd: > # start checking the generation state and deciding whether to run
vshcmd: > # tasks.
vshcmd: > thread 2
[Switching to thread 2 (Thread 0xfffff79cf060 (LWP 3911701))]
#0 gomp_team_barrier_wait_end (bar=0x4324c0, state=0) at
/local/home/matmal01/gcc-source/libgomp/config/linux/bar.c:85
85 if (__builtin_expect (state & BAR_WAS_LAST, 0))
(gdb)
vshcmd: > # ... Snipped, pressing `next` a few times.
vshcmd: > next
116 gomp_barrier_handle_tasks (state);
(gdb)
vshcmd: > next
117 gen = __atomic_load_n (&bar->generation, MEMMODEL_ACQUIRE);
(gdb)
vshcmd: > # Tasks now completed.
vshcmd: > print *(gomp_barrier_t *)0x4324c0
$2 = {total = 8, generation = 8, awaited = 8, awaited_final = 8}
(gdb)
vshcmd: > set scheduler-locking off
(gdb)
vshcmd: > delete
vshcmd: > y
Delete all breakpoints? (y or n) (gdb)
vshcmd: > # Abort because the initialisation overwrote the updates that the
vshcmd: > # tasks performed.
vshcmd: > cont
Continuing.
Thread 1 "temp.exe" received signal SIGABRT, Aborted.
[Switching to Thread 0xfffff79f1e20 (LWP 3911699)]
__pthread_kill_implementation (threadid=281474836143648, signo=signo@entry=6,
no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
44 ./nptl/pthread_kill.c: No such file or directory.
(gdb)
vshcmd: > bt
#0 __pthread_kill_implementation (threadid=281474836143648,
signo=signo@entry=6, no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
#1 0x0000fffff7a8f244 in __pthread_kill_internal (signo=6, threadid=<optimized
out>) at ./nptl/pthread_kill.c:78
#2 0x0000fffff7a4a67c in __GI_raise (sig=sig@entry=6) at
../sysdeps/posix/raise.c:26
#3 0x0000fffff7a37130 in __GI_abort () at ./stdlib/abort.c:79
#4 0x00000000004007d0 in test () at
/local/home/matmal01/gcc-source/libgomp/testsuite/libgomp.c++/task-reduction-22.C:33
#5 main () at
/local/home/matmal01/gcc-source/libgomp/testsuite/libgomp.c++/task-reduction-22.C:39
(gdb)
```