[Bug libgomp/97213] New: OpenMP "if" is dramatically slower than code-level "if" - why?

2020-09-26 Thread ttsiodras at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97213

            Bug ID: 97213
           Summary: OpenMP "if" is dramatically slower than code-level "if" - why?
           Product: gcc
           Version: 10.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libgomp
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ttsiodras at gmail dot com
                CC: jakub at gcc dot gnu.org
  Target Milestone: ---

In trying to understand how OpenMP `task` works, I did this benchmark:

#include <stdio.h>
#include <omp.h>

long fib(int val)
{
    if (val < 2)
        return val;

    long total = 0;
    {
        #pragma omp task shared(total) if(val==45)
        total += fib(val-1);
        #pragma omp task shared(total) if(val==45)
        total += fib(val-2);
        #pragma omp taskwait
    }
    return total;
}

int main()
{
    #pragma omp parallel
    #pragma omp single
    {
        long res = fib(45);
        printf("fib(45)=%ld\n", res);
    }
}

It's a simple Fibonacci calculation that only spawns two tasks, at the
top level of fib(45) - basically, one thread does fib(44), the other does
fib(43), and the results are added and returned.

I know there's a chance of a race on the "+=" of the total - but that's not
the point of this... Here's the performance on my i5 laptop:

$ gcc -O2 with_openmp_if.c -fopenmp

$ time ./a.out 
fib(45)=1134903170

real    1m4.244s
user    1m44.696s
sys     0m0.010s

64 seconds... Now compare this to the same code, but with the "if" moved from
the OpenMP level to the user-code level - i.e. this change in "fib":

long fib(int val)
{
    if (val < 2)
        return val;

    long total = 0;
    {
        if (val == 45) {
            #pragma omp task shared(total)
            total += fib(val-1);
            #pragma omp task shared(total)
            total += fib(val-2);
            #pragma omp taskwait
        } else
            return fib(val-1) + fib(val-2);
    }
    return total;
}

$ gcc -O2 with_normal_if.c -fopenmp

$ time ./a.out 
fib(45)=1134903170

real    0m8.585s
user    0m14.021s
sys     0m0.011s

We go from 64 seconds down to 8.5 seconds.

Why? 

What does the OpenMP-level "if" do so differently that it makes the code
almost an order of magnitude slower (roughly 7.5x here)?
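
One way to look at the question is to count how often each version reaches a
task construct. Under OpenMP semantics, an "if" clause that evaluates to false
does not remove the task; the task is still generated, only undeferred (run
immediately by the encountering thread). So in the first version every
non-leaf call to fib still passes through two task constructs, while the
second version reaches only two in total. The sketch below just does that
arithmetic; the function name task_constructs is made up for illustration, and
it does not measure what libgomp actually spends per construct.

```c
/* Number of task constructs encountered by fib(val) when BOTH recursive
 * calls are wrapped in "#pragma omp task", regardless of any if() clause.
 * Recurrence: T(0) = T(1) = 0, T(n) = 2 + T(n-1) + T(n-2).
 * Computed iteratively so that val = 45 returns instantly. */
unsigned long long task_constructs(int val)
{
    unsigned long long prev = 0;  /* T(n-2), starts as T(0) */
    unsigned long long cur = 0;   /* T(n-1), starts as T(1) */

    for (int n = 2; n <= val; n++) {
        unsigned long long next = 2 + cur + prev;
        prev = cur;
        cur = next;
    }
    return cur;
}
```

For val = 45 this gives 3672623804, i.e. about 3.7 billion task constructs
with the OpenMP-level "if", against 2 with the code-level "if". Any fixed
per-construct cost, however small, gets multiplied by that number.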

[Bug libgomp/97213] OpenMP "if" is dramatically slower than code-level "if" - why?

2020-09-26 Thread ttsiodras at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97213

--- Comment #2 from Thanassis Tsiodras ---
I see. I was not aware of "mergeable", TBH - thanks for pointing it out (it led
me to read about "data environments").

Thanks, Jakub.
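
For readers who have not met the clause either, here is a minimal sketch of
how "mergeable" might be combined with "if" and "final" on the benchmark. It is
an assumption-laden illustration, not the fix proposed in this bug: the names
fib_clauses and fib_mergeable and the extra "top" parameter are hypothetical.
"mergeable" only permits an implementation to merge an undeferred or included
task into the generating task's data environment; whether a given libgomp
version does so, and whether it helps the timings above, is not something the
sketch assumes. Each task writes its own variable, which also avoids the "+="
race from the original report.

```c
/* Tasks are deferred only at the top level (val == top). Below that,
 * if() is false and final() is true, so nested tasks are undeferred or
 * included, and mergeable allows (but does not require) merging them. */
static long fib_clauses(int val, int top)
{
    if (val < 2)
        return val;

    long a = 0, b = 0;  /* one result per task: no race on a shared sum */

    #pragma omp task shared(a) if(val == top) final(val != top) mergeable
    a = fib_clauses(val - 1, top);
    #pragma omp task shared(b) if(val == top) final(val != top) mergeable
    b = fib_clauses(val - 2, top);
    #pragma omp taskwait

    return a + b;
}

/* Hypothetical wrapper: one thread starts the recursion, the rest of the
 * team picks up the two top-level tasks. */
long fib_mergeable(int val)
{
    long res = 0;

    #pragma omp parallel
    #pragma omp single
    res = fib_clauses(val, val);

    return res;
}
```

Built the same way as the benchmarks (gcc -O2 -fopenmp), fib_mergeable(45)
should return the same 1134903170; without -fopenmp the pragmas are ignored
and the code runs serially with the same result.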

[Bug libgomp/97213] OpenMP "if" is dramatically slower than code-level "if" - why?

2020-09-26 Thread ttsiodras at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97213

Thanassis Tsiodras changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #4 from Thanassis Tsiodras ---
Marking as resolved.