https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82400

            Bug ID: 82400
           Summary: [openacc, nvptx] Use ptx atomic operators for
                    reductions
           Product: gcc
           Version: 8.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vries at gcc dot gnu.org
  Target Milestone: ---

At the moment, nvptx_reduction_update uses nvptx_lockless_update for types with
size <= 8 bytes:
...
/* Insert code to locklessly update *PTR with *PTR OP VAR just before
   GSI.  We use a lockless scheme for nearly all case, which looks
   like:
     actual = initval(OP);
     do {
       guess = actual;
       write = guess OP myval;
       actual = cmp&swap (ptr, guess, write)
     } while (actual bit-different-to guess);
   return write;

   This relies on a cmp&swap instruction, which is available for 32-
   and 64-bit types.  Larger types must use a locking scheme.  */

static tree
nvptx_lockless_update (location_t loc, gimple_stmt_iterator *gsi,
                       tree ptr, tree var, tree_code op)
...

The scheme is the same for all operators, using the compare-and-swap atomic ptx
instruction (atom.cas).

However, the ptx atom instruction supports some of the operators natively:
...
.op = { .and, .or, .xor, .cas, .exch, .add, .inc, .dec, .min, .max };
...
so, for instance, for addition we could use atom.add instead of an atom.cas
loop.
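
To illustrate the difference, here is a minimal host-side sketch using C11
atomics. It is only an analogy, not the nvptx backend code and not ptx:
atomic_compare_exchange_weak stands in for atom.cas, atomic_fetch_add stands
in for atom.add, and the function names reduce_add_cas and reduce_add_native
are hypothetical.

```c
#include <stdatomic.h>

/* Generic scheme: retry a compare-and-swap until it succeeds.
   Mirrors the loop in the comment above, with OP fixed to '+'.  */
int
reduce_add_cas (_Atomic int *ptr, int myval)
{
  int guess = 0;   /* initval ('+'); corrected by the first failed CAS.  */
  int write;

  do
    write = guess + myval;
  /* On failure, the current value of *ptr is stored into guess.  */
  while (!atomic_compare_exchange_weak (ptr, &guess, write));

  return write;
}

/* Native scheme: a single atomic read-modify-write, no retry loop.  */
int
reduce_add_native (_Atomic int *ptr, int myval)
{
  /* atomic_fetch_add returns the old value; add myval to get the
     value that was written.  */
  return atomic_fetch_add (ptr, myval) + myval;
}
```

Both functions leave the same value in *ptr, but the native variant needs no
loop and cannot fail and retry under contention.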
