https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121416
Tobias Burnus <burnus at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tschwinge at gcc dot gnu.org

--- Comment #1 from Tobias Burnus <burnus at gcc dot gnu.org> ---
It works with OpenMP:

#define N 100
static _Complex double ary[100], sum = 0, prod = 1;

void team_call ()
{
  _Complex double tsum = 0, tprod = 1;

  #pragma omp target teams distribute parallel for map(to: ary) reduction(+:tsum) reduction(*:tprod)
  for (int ix = 0; ix < N; ix++)
    {
      tsum += ary[ix];
      tprod *= ary[ix];
    }
  __builtin_printf("debug:\n%f + i %f\n%f + i %f\n",
                   __real__ prod, __imag__ prod,
                   __real__ tprod, __imag__ tprod);
}

int main()
{
  for (int ix = 0; ix < N; ix++)
    {
      double frac = ix * (1.0 / 1024) + 1.0;
      ary[ix] = frac + frac * 2.0j - 1.0j;
      sum += ary[ix];
      prod *= ary[ix];
    }
  team_call ();
}

* * *

Note that the generated code uses
  GOMP_atomic_start ();
  …
  GOMP_atomic_end ();
and no ISA …_atomic intrinsic.

The two functions are in libgomp/atomic.c:

/* This mutex is used when atomic operations don't exist for the target
   in the mode requested.  The result is not globally atomic, but works so
   long as all parallel references are within #pragma omp atomic directives.
   According to responses received from o...@openmp.org, appears to be within
   spec.  Which makes sense, since that's how several other compilers
   handle this situation as well.  */

GOMP_atomic_start (void)
{
  gomp_mutex_lock (&atomic_lock);
}

GOMP_atomic_end (void)
{
  gomp_mutex_unlock (&atomic_lock);
}

The same happens with 'double' if there are two reductions, or with a single
complex double reduction. With only one double reduction, however, it uses
(twice):

  flat_atomic_cmpswap_X2 v[8:9], v[4:5], v[16:19] sc0 ; tmp775,* _45, tmp776

* * *

With OpenACC, I see two
  flat_atomic_cmpswap v0, v[4:5], v[16:17] sc0
but no GOMP… function call.

* * *

I wonder whether OpenACC needs something similar to what OpenMP does.
Quoting gcc/omp-expand.cc:

/* Expand an GIMPLE_OMP_ATOMIC statement.  We try to expand
   using expand_omp_atomic_fetch_op.  If it failed, we try to
   call expand_omp_atomic_pipeline, and if it fails too, the
   ultimate fallback is wrapping the operation in a mutex
   (expand_omp_atomic_mutex).  REGION is the atomic region built
   by build_omp_regions_1().  */

static void
expand_omp_atomic (struct omp_region *region)
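
To illustrate the difference between the two fallbacks named in that comment, here is a minimal host-side sketch. It is not GCC's or libgomp's code: all demo_* names are hypothetical, the lock is a plain C11 spinlock standing in for libgomp's atomic_lock, and an IEEE 754 64-bit 'double' is assumed. An 8-byte double can be updated with a compare-and-swap loop on its bit pattern (roughly the shape of what the flat_atomic_cmpswap_X2 sequence does), whereas the 16-byte _Complex double has no CAS of that width here, so the update is bracketed by lock/unlock in the same way as the GOMP_atomic_start/GOMP_atomic_end pair.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Assumption: double is 64 bits wide, so it fits one 8-byte CAS.  */
static_assert (sizeof (double) == sizeof (uint64_t),
               "sketch assumes a 64-bit double");

/* Hypothetical stand-in for libgomp's atomic_lock (a spinlock here).  */
static atomic_flag demo_lock = ATOMIC_FLAG_INIT;

static void
demo_atomic_start (void)
{
  /* Spin until the flag was previously clear, i.e. we own the lock.  */
  while (atomic_flag_test_and_set_explicit (&demo_lock, memory_order_acquire))
    ;
}

static void
demo_atomic_end (void)
{
  atomic_flag_clear_explicit (&demo_lock, memory_order_release);
}

/* CAS-loop update: add ADDEND to the double stored as bits in *TARGET.  */
static void
demo_add_double_cas (_Atomic uint64_t *target, double addend)
{
  uint64_t old_bits = atomic_load (target);
  uint64_t new_bits;
  do
    {
      double value;
      memcpy (&value, &old_bits, sizeof value);
      value += addend;
      memcpy (&new_bits, &value, sizeof new_bits);
      /* On failure, old_bits is refreshed with the current contents.  */
    }
  while (!atomic_compare_exchange_weak (target, &old_bits, new_bits));
}

/* Read back the double stored as bits in *TARGET.  */
static double
demo_load_double (_Atomic uint64_t *target)
{
  uint64_t bits = atomic_load (target);
  double value;
  memcpy (&value, &bits, sizeof value);
  return value;
}

/* Lock-based update: multiply the 16-byte *TARGET by FACTOR.  Only
   atomic with respect to other updates that take the same lock.  */
static void
demo_mul_complex_locked (_Complex double *target, _Complex double factor)
{
  demo_atomic_start ();
  *target *= factor;
  demo_atomic_end ();
}
```

As with the libgomp mutex, the locked variant only protects against other accesses that go through the same lock, which is the caveat stated in the libgomp/atomic.c comment.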