On 2/15/22 12:08, Thomas Schwinge wrote:
Hi Tom!

On 2022-02-15T11:52:29+0100, Tom de Vries <tdevr...@suse.de> wrote:
On 2/15/22 08:34, Thomas Schwinge wrote:
For my understanding:

Thanks for your explanations!

It is expected that this changes, for example (similar elsewhere)
'nvptx-none/libatomic/store_4_.o', to use (a) 'atom.exch' (with a new
dummy register allocated)

Yes.

We could do slightly better by emitting that as:
...
membar.sys;
{  .reg .u32 dummy;
     atom.exch.b32 dummy,[%r22],%r23;
}
membar.sys;
...
which could improve register pressure.

Or, use the "bit bucket" operand -- assuming that's applicable here?

     atom.exch.b32 _,[%r22],%r23;


Ah, yes, that looks like exactly what we need, thanks for pointing that out :)

I'll try to create a patch for this.

Thanks,
- Tom

For example, see PTX 3.1, 8.2 "PTX Instructions".
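
To make the two candidate output forms concrete, here is a minimal C sketch that formats either one for a 32-bit exchange. The helper name format_discarding_exch and its parameters are hypothetical and not part of GCC. It assumes, as asked above, that the bit-bucket operand is acceptable as the destination of atom.exch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of GCC: format a 32-bit atom.exch whose
   result is unused.  If USE_BIT_BUCKET is nonzero, the destination is the
   PTX bit-bucket operand "_" (assumed to be valid here); otherwise a
   block-scoped dummy register is declared.  Returns the number of
   characters written, excluding the terminating NUL.  */
int
format_discarding_exch (char *buf, size_t size, const char *addr_reg,
                        const char *val_reg, int use_bit_bucket)
{
  int n;

  assert (buf != NULL && size > 0);
  if (use_bit_bucket)
    n = snprintf (buf, size, "atom.exch.b32 _,[%s],%s;", addr_reg, val_reg);
  else
    n = snprintf (buf, size,
                  "{ .reg .u32 dummy; atom.exch.b32 dummy,[%s],%s; }",
                  addr_reg, val_reg);
  /* Fail loudly on an encoding error or on truncation.  */
  assert (n >= 0 && (size_t) n < size);
  return n;
}
```

Calling it with "%r22" and "%r23" reproduces the two instruction forms quoted above: the bit-bucket form needs no register declaration at all, while the scoped form still declares one.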


Regards
  Thomas


I had just written a patch for that (attached, for the record), but using a
scratch register. It seems that this similar code, where the result of the
exchange is unused:
...
void
foo (U_4 *mptr, U_4 newval)
{
    /* 5 == __ATOMIC_SEQ_CST; the returned old value is discarded.  */
    __atomic_exchange_n (mptr, newval, 5);
}
...
still maps to:
...
.reg .u32 %r24;
membar.sys;
atom.exch.b32 %r24,[%r22],%r23;
membar.sys;
...
so that may not be the right way to do it.
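
For reference, the two source-level forms being compared can be written out as the sketch below. It runs on any host with the GCC __atomic builtins. The function names are made up for illustration, and U_4 is assumed to be a 4-byte unsigned type as in libatomic. It only shows that the two forms leave the same value in memory. It says nothing about which PTX either one produces.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t U_4;  /* assumption: matches libatomic's 4-byte type */

/* Store NEWVAL to *MPTR by exchanging and discarding the old value.
   This is the shape for which no destination register is really needed.  */
void
store_via_exchange (U_4 *mptr, U_4 newval)
{
  (void) __atomic_exchange_n (mptr, newval, __ATOMIC_SEQ_CST);
}

/* The same store written with the store builtin directly.  */
void
store_direct (U_4 *mptr, U_4 newval)
{
  __atomic_store_n (mptr, newval, __ATOMIC_SEQ_CST);
}

/* Return 1 if both variants leave NEWVAL in memory, 0 otherwise.  */
int
stores_agree (U_4 newval)
{
  U_4 a = 0, b = 0;

  store_via_exchange (&a, newval);
  store_direct (&b, newval);
  return a == newval && b == newval;
}
```

A quick check is assert (stores_agree (42)). On nvptx, the interesting part would be comparing the generated assembly for the two functions.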

--- a/gcc/config/nvptx/nvptx.md
+++ b/gcc/config/nvptx/nvptx.md
@@ -89,9 +89,10 @@
  ;; only literal constants, which differ from the generic ones, which
  ;; permit subregs and symbolc constants (as appropriate)
  (define_predicate "nvptx_register_operand"
-  (match_code "reg")
+  (match_code "reg,scratch")
  {
-  return register_operand (op, mode);
+  return (register_operand (op, mode)
+       || (GET_CODE (op) == SCRATCH && GET_MODE (op) == mode));
  })

  (define_predicate "nvptx_nonimmediate_operand"
@@ -188,7 +189,7 @@

  (define_constraint "R"
    "A pseudo register."
-  (match_code "reg"))
+  (ior (match_code "reg") (match_code "scratch")))

  (define_constraint "Ia"
    "Any integer constant."
@@ -2036,6 +2037,7 @@
       (match_operand:SDIM 2 "nvptx_nonmemory_operand" "Ri"))] ;; input
    ""
    {
+    bool scratch_dst_p = GET_CODE (operands[0]) == SCRATCH;
      if (nvptx_mem_local_p (operands[1]))
        {
       output_asm_insn ("{", NULL);
@@ -2047,7 +2049,9 @@
       return "";
        }
      const char *t
-      = "%.\tatom%A1.exch.b%T0\t%0, %1, %2;";
+      = (scratch_dst_p
+      ? "{ .reg.u%T0 dummy; %.\tatom%A1.exch.b%T0\t dummy,%1, %2; }"
+      : "%.\tatom%A1.exch.b%T0\t%0, %1, %2;");
      return nvptx_output_atomic_insn (t, operands, 1, 3);
    }
    [(set_attr "atomic" "true")])
@@ -2079,7 +2083,7 @@
      /* Fall back to expand_atomic_store.  */
      FAIL;

-  rtx tmpreg = gen_reg_rtx (<MODE>mode);
+  rtx tmpreg = gen_rtx_SCRATCH (<MODE>mode);
    emit_insn (gen_atomic_exchange<mode> (tmpreg, operands[0], operands[1],
                                       operands[2]));
    DONE;
