Arsen Arsenović wrote:
Here's a fix for another flaky test.

In r11-3059-g8183ebcdc1c843, Julian fixed a few issues with
atomic_capture-2.c relying on iteration order guarantees that do not
exist under OpenACC parallelized loops and, notably, do not happen even
by accident on AMDGCN.

Namely, the order in which the iterations execute is unspecified for

#pragma acc parallel loop
  for (i = 0; i < N; i++)
  #pragma acc atomic capture
      { idata[i] = igot; igot = i; }

Exactly one iteration captures the original value 1234 of igot; every
other iteration captures the 'i' stored by whichever iteration ran the
atomic just before it. Thus, idata[...] holds all i = 0 to N-1,
except for one.
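
To illustrate why a check must not depend on the order, here is a
minimal host-only sketch. It is not the code from the testcase:
run_in_order, check_capture, MAX_N and IGOT_INIT are hypothetical names,
and the parallel loop is merely simulated by running the capture body in
a caller-supplied order. It assumes N is smaller than 1234, so that the
initial value cannot be confused with an index.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_N 64        /* hypothetical bound; must stay below IGOT_INIT */
#define IGOT_INIT 1234  /* initial value of igot, as described above */

/* Host-only stand-in for the OpenACC loop (an assumption, not the real
   test): executes the capture body once per iteration, in the order
   given by order[], and returns the final value of igot.  */
static int
run_in_order (int *idata, const int *order, int n)
{
  int igot = IGOT_INIT;
  for (int k = 0; k < n; k++)
    {
      int i = order[k];
      idata[i] = igot;  /* capture the old value ...  */
      igot = i;         /* ... then overwrite it */
    }
  return igot;
}

/* Order-independent check: idata must hold IGOT_INIT exactly once and
   every index 0..n-1 at most once; the single index that was never
   captured must be the one left in igot.  */
static bool
check_capture (const int *idata, int n, int final_igot)
{
  assert (n > 0 && n <= MAX_N);
  bool seen[MAX_N] = { false };
  int init_count = 0;

  for (int i = 0; i < n; i++)
    {
      int v = idata[i];
      if (v == IGOT_INIT)
        {
          init_count++;
          continue;
        }
      if (v < 0 || v >= n || seen[v])
        return false;   /* out of range or captured twice */
      seen[v] = true;
    }

  if (init_count != 1)
    return false;
  if (final_igot < 0 || final_igot >= n || seen[final_igot])
    return false;
  return true;
}
```

Calling run_in_order with, say, { 0, 1, 2, 3 } and then { 3, 2, 1, 0 }
yields different idata contents, but check_capture accepts both, which
is the property a non-flaky test needs.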

* * *

Likewise for the others.

The atomic_capture-3.c testcase was created in commit
r12-310-g4cf3b10f27b199 by copying atomic_capture-2.c and adding
additional options, but it was copied from an older version of
atomic_capture-2.c that lacked these ordering fixes, so the same
failures resurfaced in this test.

atomic_capture-3.c seems to be identical to atomic_capture-2.c
except for the explicitly added:

/* { dg-additional-options "-fmodulo-sched -fmodulo-sched-allow-regmoves" } */

for PR rtl-optimization/100225 and PR rtl-optimization/84878.

* * *

This patch ports those fixes from atomic_capture-2.c into
atomic_capture-3.c.

libgomp/ChangeLog:

        * testsuite/libgomp.oacc-c-c++-common/atomic_capture-3.c: Copy
        changes in r11-3059-g8183ebcdc1c843 from atomic_capture-2.c.

LGTM - thanks for digging!

Tobias

PS: I wonder whether it wouldn't have been more sensible to use
  #include "atomic_capture-2.c"  and to mention the PRs in a comment.
But that's a 2021 topic ...
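
For illustration only, such an include-based atomic_capture-3.c might
look roughly like the sketch below. This is not what was committed; it
assumes that the options line and the two PR references are the only
intended differences between the two files.

```c
/* Sketch only: reuse atomic_capture-2.c with the extra scheduling
   options.  See PR rtl-optimization/100225 and
   PR rtl-optimization/84878.  */
/* { dg-additional-options "-fmodulo-sched -fmodulo-sched-allow-regmoves" } */

#include "atomic_capture-2.c"
```

With that layout, later fixes to atomic_capture-2.c would apply to both
tests without having to be copied over.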
