On 2/15/22 12:08, Thomas Schwinge wrote:
Hi Tom!
On 2022-02-15T11:52:29+0100, Tom de Vries <tdevr...@suse.de> wrote:
On 2/15/22 08:34, Thomas Schwinge wrote:
For my understanding:
Thanks for your explanations!
It is expected that this changes, for example (similar elsewhere)
'nvptx-none/libatomic/store_4_.o', to use (a) 'atom.exch' (with a new
dummy register allocated)
Yes.
We could do slightly better by emitting that as:
...
membar.sys;
{ .reg .u32 dummy;
atom.exch.b32 dummy,[%r22],%r23;
}
membar.sys;
...
which could improve register pressure.
Or, use the "bit bucket" operand -- assuming that's applicable here?
atom.exch.b32 _,[%r22],%r23;
Ah, yes, that looks like exactly what we need, thanks for pointing that out :)
I'll try to create a patch for this.
Thanks,
- Tom
For example, see PTX 3.1, 8.2 "PTX Instructions".
Regards
Thomas
I just wrote a patch for that (attached, for the record), but using a scratch
register, and it seems that this similar code:
...
void
foo (U_4 *mptr, U_4 newval)
{
__atomic_exchange_n (mptr, newval, 5);
}
...
still maps to:
...
.reg .u32 %r24;
membar.sys;
atom.exch.b32 %r24,[%r22],%r23;
membar.sys;
...
so that may not be the right way to do it.
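
For reference, a minimal host-side sketch of the three cases being compared
here: a plain atomic store, an exchange whose result is discarded (as in
'foo' above), and an exchange whose result is used. The 'U_4' typedef and
the function names are assumptions made for illustration, not the libatomic
sources. The literal memory order 5 used in 'foo' corresponds to GCC's
'__ATOMIC_SEQ_CST'.

```c
#include <stdint.h>

/* Assumption: stand-in for the 4-byte unsigned type used by 'foo' above.  */
typedef uint32_t U_4;

/* Plain atomic store: no result value exists at the source level.  */
void
store_4 (U_4 *mptr, U_4 newval)
{
  __atomic_store_n (mptr, newval, __ATOMIC_SEQ_CST);
}

/* Exchange with the old value discarded.  Same shape as 'foo' above; the
   destination of the exchange is dead, so it is a candidate for a dummy or
   bit bucket destination.  */
void
exchange_discard_4 (U_4 *mptr, U_4 newval)
{
  (void) __atomic_exchange_n (mptr, newval, __ATOMIC_SEQ_CST);
}

/* Exchange with the old value returned: here a real destination register
   is required.  */
U_4
exchange_4 (U_4 *mptr, U_4 newval)
{
  return __atomic_exchange_n (mptr, newval, __ATOMIC_SEQ_CST);
}
```

Compiling the first two functions for nvptx and comparing the generated PTX
is one way to check whether both paths avoid allocating a result register;
the output shown above suggests that the discarded-result exchange currently
does not.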
--- a/gcc/config/nvptx/nvptx.md
+++ b/gcc/config/nvptx/nvptx.md
@@ -89,9 +89,10 @@
;; only literal constants, which differ from the generic ones, which
;; permit subregs and symbolc constants (as appropriate)
(define_predicate "nvptx_register_operand"
- (match_code "reg")
+ (match_code "reg,scratch")
{
- return register_operand (op, mode);
+ return (register_operand (op, mode)
+ || (GET_CODE (op) == SCRATCH && GET_MODE (op) == mode));
})
(define_predicate "nvptx_nonimmediate_operand"
@@ -188,7 +189,7 @@
(define_constraint "R"
"A pseudo register."
- (match_code "reg"))
+ (ior (match_code "reg") (match_code "scratch")))
(define_constraint "Ia"
"Any integer constant."
@@ -2036,6 +2037,7 @@
(match_operand:SDIM 2 "nvptx_nonmemory_operand" "Ri"))] ;; input
""
{
+ bool scratch_dst_p = GET_CODE (operands[0]) == SCRATCH;
if (nvptx_mem_local_p (operands[1]))
{
output_asm_insn ("{", NULL);
@@ -2047,7 +2049,9 @@
return "";
}
const char *t
- = "%.\tatom%A1.exch.b%T0\t%0, %1, %2;";
+ = (scratch_dst_p
+ ? "{ .reg.u%T0 dummy; %.\tatom%A1.exch.b%T0\t dummy,%1, %2; }"
+ : "%.\tatom%A1.exch.b%T0\t%0, %1, %2;");
return nvptx_output_atomic_insn (t, operands, 1, 3);
}
[(set_attr "atomic" "true")])
@@ -2079,7 +2083,7 @@
/* Fall back to expand_atomic_store. */
FAIL;
- rtx tmpreg = gen_reg_rtx (<MODE>mode);
+ rtx tmpreg = gen_rtx_SCRATCH (<MODE>mode);
emit_insn (gen_atomic_exchange<mode> (tmpreg, operands[0], operands[1],
operands[2]));
DONE;