Issue 145165
Summary #pragma omp critical/#pragma omp atomic for a shared variable on device yields wrong values on nvptx.
Labels new issue
Assignees
Reporter bschulz81
Compile with `-fopenmp -fopenmp-targets=nvptx64-nvidia-cuda`:
```
#include <omp.h>
#include <vector>
#include <iostream>
int main()
{
    size_t elements=20;
    std::vector<double> v1(elements),v2(elements);

    #pragma omp parallel for simd
    for(size_t i=1;i<elements;i++)
    {
        v1[i]=(double)i;
        v2[i]=(double)i;
    }

    double* v1d=v1.data(),*v2d=v2.data();

    double tmp=0;
    #pragma omp target enter data map (to:v1d[0:elements])
    #pragma omp target enter data map (to:v2d[0:elements])

    #pragma omp target teams distribute parallel for shared(tmp)
    for(size_t i=1;i<20;i++)
    {
        #pragma omp critical
        tmp+=v1d[i]*v2d[i];
    }
    std::cout<<tmp<<"\n";
}
```
This yields 0 instead of the expected 2470.

If one replaces
`#pragma omp target teams distribute parallel for shared(tmp)`
by
`#pragma omp parallel for shared(tmp)`
so that the loop runs on the host, then the result is correct.

If one replaces the pragma
`#pragma omp target teams distribute parallel for shared(tmp)`
by
`#pragma omp target teams distribute reduction(+:tmp)`

then the result is also correct, even though, according to the OpenMP 5.x standard,
`teams distribute` has no reduction clause.

If one replaces the pragma
`#pragma omp critical`
by
`#pragma omp atomic`

the result is the same as with `critical`, that is, still wrong.
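
For comparison, here is a minimal sketch of two variants that may be worth testing against the reproducer above. It rests on an assumption that this report does not confirm: that the scalar `tmp`, having no `map` clause, is treated as firstprivate on the `target` construct, so the device-side updates never reach the host copy. The helper names `sum_squares_atomic` and `sum_squares_reduction` are hypothetical and exist only for illustration. The first variant maps `tmp` explicitly and uses `atomic` rather than `critical`, since `critical` is not expected to give mutual exclusion across different teams. The second uses a `reduction` on the full combined construct.

```
#include <cstddef>
#include <vector>

// Hypothetical helper: sums v[i]*v[i] for i in [1, v.size()) using an atomic update.
// Assumption: mapping tmp with tofrom copies the device result back to the host.
double sum_squares_atomic(const std::vector<double>& v)
{
    const double* vd = v.data();
    const std::size_t n = v.size();
    double tmp = 0.0;

    #pragma omp target teams distribute parallel for map(to: vd[0:n]) map(tofrom: tmp)
    for (std::size_t i = 1; i < n; i++)
    {
        // atomic protects the update of the single mapped scalar
        #pragma omp atomic
        tmp += vd[i] * vd[i];
    }
    return tmp;
}

// Hypothetical helper: same sum, expressed as a reduction on the combined construct.
double sum_squares_reduction(const std::vector<double>& v)
{
    const double* vd = v.data();
    const std::size_t n = v.size();
    double tmp = 0.0;

    #pragma omp target teams distribute parallel for map(to: vd[0:n]) map(tofrom: tmp) reduction(+: tmp)
    for (std::size_t i = 1; i < n; i++)
    {
        tmp += vd[i] * vd[i];
    }
    return tmp;
}
```

Both functions also compile without OpenMP, in which case the pragmas are ignored and the loops run sequentially on the host. If either variant returns the expected sum on nvptx while the original reproducer does not, that would point toward the data-sharing of `tmp` rather than the `critical`/`atomic` implementation.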