[Bug c/96844] New: OpenMP: two worksharing constructs with different num_threads clauses break thread pooling

2020-08-29 Thread mority at posteo dot net
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96844

            Bug ID: 96844
           Summary: OpenMP: two worksharing constructs with different num_threads clauses break thread pooling
           Product: gcc
           Version: 10.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: mority at posteo dot net
  Target Milestone: ---

Created attachment 49154
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49154&action=edit
Code that produces bug

Hi,

If a for loop contains two OpenMP worksharing constructs that specify
different values in their num_threads clauses, thread pooling does not seem to
work correctly.

For example, the first worksharing construct has num_threads(2) and the second
has num_threads(4). The expected behavior is that a total of 4 threads are
created: the first worksharing construct uses 2 of these threads and the second
uses all 4.

However, this does not seem to be the case. Thread pooling appears to work for
the first worksharing construct but fails for the second: every time the second
worksharing construct is executed, 2 new threads are created. This causes
significant overhead.

For clarification: There is no nested parallelism.
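
A minimal sketch of the pattern described above, not the attached code itself.
It assumes the two constructs are "#pragma omp parallel for" loops and that
thread identity is observed through the Linux gettid system call. The names foo
and bar are borrowed from the PRAGMA_FOO/PRAGMA_BAR macros in the compile
commands; current_tid and run are hypothetical helpers.

```c
#define _GNU_SOURCE            /* exposes syscall() on glibc */
#include <assert.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumption: Linux. Returns the kernel thread id of the calling thread. */
long current_tid(void)
{
    return (long)syscall(SYS_gettid);
}

/* First region: requests 2 threads and records which thread ran each iteration. */
void foo(long *tids, int n)
{
    assert(tids != NULL && n >= 0);
    #pragma omp parallel for num_threads(2)
    for (int i = 0; i < n; i++)
        tids[i] = current_tid();
}

/* Second region: requests 4 threads and records which thread ran each iteration. */
void bar(long *tids, int n)
{
    assert(tids != NULL && n >= 0);
    #pragma omp parallel for num_threads(4)
    for (int i = 0; i < n; i++)
        tids[i] = current_tid();
}

/* Sequential outer loop that alternates between the two regions and prints
   one line per inner iteration: region name, outer step, thread id. */
void run(int outer, int n)
{
    long tids[64];
    assert(n >= 0 && n <= 64);
    for (int step = 0; step < outer; step++) {
        foo(tids, n);
        for (int i = 0; i < n; i++)
            printf("foo %d %ld\n", step, tids[i]);
        bar(tids, n);
        for (int i = 0; i < n; i++)
            printf("bar %d %ld\n", step, tids[i]);
    }
}
```

Calling run(10, 8) from main and compiling with -fopenmp prints the thread id
for every iteration. If the thread pool is reused, the ids reported for bar
should stay the same from one outer step to the next. Without -fopenmp the
pragmas are ignored and every iteration runs on the main thread.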

The attached code can be used to reproduce the bug. The code can be compiled
into 4 different versions using conditional compilation:

1. no OpenMP
gcc -O3 -I. -Wall -g -DPRINT_TID mwe2_woMPI.c -o mwe2_woMPI

2. worksharing construct foo only
gcc -O3 -I. -Wall -g -DPRINT_TID -DPRAGMA_FOO -fopenmp mwe2_woMPI.c -o mwe2_woMPI_foo

3. worksharing construct bar only
gcc -O3 -I. -Wall -g -DPRINT_TID -DPRAGMA_BAR -fopenmp mwe2_woMPI.c -o mwe2_woMPI_bar

4. both worksharing constructs
gcc -O3 -I. -Wall -g -DPRINT_TID -DPRAGMA_FOO -DPRAGMA_BAR -fopenmp mwe2_woMPI.c -o mwe2_woMPI_foobar

I analyzed the output of the different versions, which contains the thread ID
for every iteration. Each worksharing construct works correctly in isolation,
creating 2 or 4 threads, respectively. However, if both worksharing constructs
are used at the same time, the first worksharing construct uses 2 different
threads and the second uses 22 different threads.
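
A minimal sketch of how such a count could be obtained, assuming the thread ids
printed by the program have been collected into an array per worksharing
construct. count_distinct and cmp_long are hypothetical helpers and are not
taken from the attachments.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Three-way comparison for qsort; avoids overflow from plain subtraction. */
int cmp_long(const void *a, const void *b)
{
    long x = *(const long *)a;
    long y = *(const long *)b;
    return (x > y) - (x < y);
}

/* Returns the number of distinct values in ids[0..n-1].
   Works on a sorted copy, so the caller's array is left untouched. */
size_t count_distinct(const long *ids, size_t n)
{
    if (n == 0)
        return 0;

    long *copy = malloc(n * sizeof *copy);
    assert(copy != NULL);
    memcpy(copy, ids, n * sizeof *copy);
    qsort(copy, n, sizeof *copy, cmp_long);

    /* After sorting, each change between neighbours marks a new value. */
    size_t distinct = 1;
    for (size_t i = 1; i < n; i++)
        if (copy[i] != copy[i - 1])
            distinct++;

    free(copy);
    return distinct;
}
```

Since the operating system may reuse the id of a thread that has exited, a
count obtained this way is best read as a lower bound on the number of threads
actually created.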

GCC versions 8.3, 9.2, and 10.2 all show this behavior. I also compiled the
code with clang 10.1 and icc 19.4, which both handle the case correctly.

[Bug c/96844] OpenMP: two worksharing constructs with different num_threads clauses break thread pooling

--- Comment #1 from Moritz Fischer ---
Created attachment 49155
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49155&action=edit
python script to count number of different threads used for each worksharing
construct