[Bug target/82399] New: [openacc, nvptx] Optimize complex reduction

vries at gcc dot gnu.org Mon, 02 Oct 2017 04:54:08 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82399


            Bug ID: 82399
           Summary: [openacc, nvptx] Optimize complex reduction
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vries at gcc dot gnu.org
  Target Milestone: ---

Currently reduction updates are implemented like this:
...
/* Emit a sequence to update a reduction accumulator at *PTR with the
   value held in VAR using operator OP.  Return the updated value.

   TODO: optimize for atomic ops and independent complex ops.  */

static tree
nvptx_reduction_update (location_t loc, gimple_stmt_iterator *gsi,
                        tree ptr, tree var, tree_code op)
{
  tree type = TREE_TYPE (var);
  tree size = TYPE_SIZE (type);

  if (size == TYPE_SIZE (unsigned_type_node)
      || size == TYPE_SIZE (long_long_unsigned_type_node))
    return nvptx_lockless_update (loc, gsi, ptr, var, op);
  else
    return nvptx_lockfull_update (loc, gsi, ptr, var, op);
}
...

This means that, for instance, a complex long long addition takes the
nvptx_lockfull_update path, because the size of the complex type matches
neither unsigned nor unsigned long long.

The real and imaginary parts of the addition are independent, so instead we
could call nvptx_lockless_update twice, once per part (as the TODO implies).
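
As a rough illustration of the idea only, here is a host-side C11 sketch, not
the GIMPLE that nvptx_reduction_update would emit. The names complex_ll,
lockless_add and complex_reduction_add are hypothetical, and the sketch
assumes that the two parts can be addressed as separate 64-bit words. It
relies on addition being component-wise; an operator such as complex
multiplication, where each result part depends on both input parts, could not
be split this way.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a complex long long accumulator: the real
   and imaginary parts are stored as two separate 64-bit atomics.  */
struct complex_ll
{
  _Atomic long long re;
  _Atomic long long im;
};

/* Lockless update of a single 64-bit word with a compare-and-swap
   loop.  Returns the value this call stored.  */
static long long
lockless_add (_Atomic long long *ptr, long long var)
{
  long long expected = atomic_load (ptr);
  long long desired;

  do
    /* On CAS failure, EXPECTED is refreshed with the current value,
       so DESIRED is recomputed from up-to-date data.  */
    desired = expected + var;
  while (!atomic_compare_exchange_weak (ptr, &expected, desired));

  return desired;
}

/* Update both parts independently: two lockless updates instead of
   one update of the whole 128-bit value under a lock.  */
static void
complex_reduction_add (struct complex_ll *acc, long long re, long long im)
{
  assert (acc != NULL);
  lockless_add (&acc->re, re);
  lockless_add (&acc->im, im);
}
```

Between the two updates another thread may observe a new real part alongside
an old imaginary part. For a reduction that should be acceptable, since only
the final accumulated value is used.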
