https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100321

            Bug ID: 100321
           Summary: [OpenMP][nvptx] (Con't) Reduction fails with
                    optimization and 'loop'/'for simd' but not with 'for'
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Keywords: openmp, wrong-code
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: burnus at gcc dot gnu.org
                CC: vries at gcc dot gnu.org
  Target Milestone: ---
            Target: nvptx-none

Created attachment 50703
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50703&action=edit
target_parallel_for_simd.cpp - compile with g++ -fopenmp -O1 (and nvptx
offloading)

Similar to PR target/100232.
I had hoped that the posted patch would solve this issue as well, but it does
not :-/
[ https://gcc.gnu.org/pipermail/gcc-patches/2021-April/569038.html ]
(However, it does solve the two sollve_vv issues I mentioned in PR100232 :-)
Thanks!)

Namely, https://github.com/TApplencourt/OvO 's
test_src/cpp/hierarchical_parallelism/reduction_add-complex_double/target_parallel_for_simd.cpp

(also attached) works on the host and AMD GCN, but with nvptx:

  g++ -fopenmp -O1 target_parallel_for_simd.cpp -foffload=-latomic

it fails as

  Expected: (32768,0) Got: (1024,0)

(with exit status code 112)

The -O1 is needed due to the missing .alias.

When removing the 'simd' from
    #pragma omp target parallel for simd map(tofrom: counter_N0) reduction(+: counter_N0)
it does work.
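
For readers without the attachment, here is a minimal sketch of the kind of
loop involved. It is not the attached file: the helper name sum_ones and its
parameter n are made up for illustration, and the declare reduction line is an
assumption about how '+' is made usable for std::complex<double>. Only the
pragma on the loop is taken from this report.

```cpp
#include <complex>

// Assumption: a user-defined reduction is declared so that '+' can be used
// with std::complex<double>, which is not a built-in arithmetic type.
#pragma omp declare reduction(+ : std::complex<double> : omp_out += omp_in)

// Hypothetical helper (not from the attachment): adds 1.0 to a complex
// counter n times inside the offloaded loop and returns the total.
std::complex<double> sum_ones(int n) {
  std::complex<double> counter_N0{};  // value-initialized to (0,0)
  // Pragma as quoted in this report; dropping 'simd' is the variant that works.
  #pragma omp target parallel for simd map(tofrom: counter_N0) reduction(+: counter_N0)
  for (int i = 0; i < n; ++i)
    counter_N0 += std::complex<double>{1.0};
  return counter_N0;
}
```

A correct build should return (32768,0) for sum_ones(32768), which is the
"Expected" value quoted above; the failing nvptx run reported (1024,0) for the
attached test. Without -fopenmp the pragmas are ignored and the loop simply
runs serially on the host.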
