https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82400
            Bug ID: 82400
           Summary: [openacc, nvptx] Use ptx atomic operators for reductions
           Product: gcc
           Version: 8.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vries at gcc dot gnu.org
  Target Milestone: ---

At the moment, nvptx_reduction_update uses nvptx_lockless_update for types
with size <= 8 bytes:
...
/* Insert code to locklessly update *PTR with *PTR OP VAR just before
   GSI.  We use a lockless scheme for nearly all case, which looks
   like:
     actual = initval(OP);
     do {
       guess = actual;
       write = guess OP myval;
       actual = cmp&swap (ptr, guess, write)
     } while (actual bit-different-to guess);
   return write;

   This relies on a cmp&swap instruction, which is available for 32-
   and 64-bit types.  Larger types must use a locking scheme.  */

static tree
nvptx_lockless_update (location_t loc, gimple_stmt_iterator *gsi,
                       tree ptr, tree var, tree_code op)
...

The scheme is the same for all operators: it uses the compare-and-swap atomic
ptx instruction (atom.cas).

However, some of the operators are supported natively by ptx:
...
.op = { .and, .or, .xor, .cas, .exch, .add, .inc, .dec, .min, .max };
...
so for addition, for instance, we could use an atom.add instead.
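
To make the difference concrete, here is a minimal host-side sketch in C11 that contrasts the two update styles for addition. It is not nvptx backend code and emits no ptx: update_cas_add and update_native_add are hypothetical names, atomic_compare_exchange_weak stands in for atom.cas, and atomic_fetch_add stands in for atom.add.
...
#include <stdatomic.h>

/* Generic scheme from the comment above, specialized to OP = '+':
   guess, compute, compare-and-swap, retry while the guess was wrong.  */
static int
update_cas_add (_Atomic int *ptr, int myval)
{
  int guess = 0;  /* initval ('+'); corrected by the first failed CAS.  */
  int write;

  do
    write = guess + myval;
  /* On failure the current value of *ptr is stored into guess.  */
  while (!atomic_compare_exchange_weak (ptr, &guess, write));

  return write;
}

/* Native scheme: a single atomic read-modify-write, no retry loop.
   atomic_fetch_add returns the old value, so add myval to get the
   value that was written.  */
static int
update_native_add (_Atomic int *ptr, int myval)
{
  return atomic_fetch_add (ptr, myval) + myval;
}
...
Both functions leave the same value in *ptr and return the same result; the native variant simply avoids the loop, which is the saving the report is after for operators that ptx supports directly.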