On 4/15/25 11:52 AM, Richard Sandiford wrote:
Indu Bhagat <indu.bha...@oracle.com> writes:
Using post-index st2g is a faster way of memory tagging/untagging,
because a post-index 'st2g tag, [addr], #32' is equivalent to:

  stg  tag, [addr, #0]
  stg  tag, [addr, #16]
  add  addr, addr, #32
TBD:
- Currently generated in the aarch64 backend.  Not sure if this is
  the right way to do it.
If we do go for the "aarch64_granule_memory_operand" approach that
I described for patch 3, then that predicate (and the associated constraint)
could handle PRE_MODIFY and POST_MODIFY addresses, which would remove
the need for separate patterns.
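
For concreteness, here is a rough sketch of what such a predicate might
look like in predicates.md.  Only the predicate name and the existing
aarch64_granule16_simm9 predicate come from this thread; the body below is
an illustrative guess, not the definition from the patch 3 review:

  (define_predicate "aarch64_granule_memory_operand"
    (match_code "mem")
  {
    rtx addr = XEXP (op, 0);

    /* Writeback forms: accept a base register modified by a
       granule-aligned constant step (guessed behaviour).  */
    if (GET_CODE (addr) == POST_MODIFY || GET_CODE (addr) == PRE_MODIFY)
      return (REG_P (XEXP (addr, 0))
              && GET_CODE (XEXP (addr, 1)) == PLUS
              && aarch64_granule16_simm9 (XEXP (XEXP (addr, 1), 1), DImode));

    /* Plain base register, or base plus granule-aligned immediate.  */
    if (REG_P (addr))
      return true;
    return (GET_CODE (addr) == PLUS
            && REG_P (XEXP (addr, 0))
            && aarch64_granule16_simm9 (XEXP (addr, 1), DImode));
  })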
I think I understand :)  I will try it out.  I guess one of the unknowns
for me is whether the PRE_MODIFY / POST_MODIFY forms will be generated as
expected, even when the involved instructions have an unspec...
- Also not clear how to weave in the generation of stz2g.
I think stz2g could be:

  (set (match_operand:OI 0 "aarch64_granule_memory_operand" "+<new constraint>")
       (unspec_volatile:OI
         [(const_int 0)
          (match_operand:DI 1 "register_operand" "rk")]
         UNSPECV...))
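
Spelled out in full, that might look something like the sketch below.
UNSPECV_TAG_ZERO_SPACE and the "Umg" constraint are placeholder names made
up for illustration, and the output template assumes %0 can already be
printed for whatever address forms the new constraint accepts:

  (define_insn "*stz2g"
    [(set (match_operand:OI 0 "aarch64_granule_memory_operand" "+Umg")
          (unspec_volatile:OI
            [(const_int 0)
             (match_operand:DI 1 "register_operand" "rk")]
            UNSPECV_TAG_ZERO_SPACE))]
    "TARGET_MEMTAG"
    "stz2g\\t%1, %0"
    [(set_attr "type" "memtag")]
  )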
The question I have is what changes will be necessary to have the
compiler DTRT, i.e. for the zero-init case, instead of

  stg  x0, [x0, #0]
  str  wzr, [x0]

generate

  stzg x0, [x0]

Similarly, for the value-init case, instead of

  stg  x0, [x0, #0]
  mov  w1, #42
  str  w1, [x0]

generate

  mov  w1, #42
  stgp x1, xzr, [x0]

I guess once I have worked out the patterns for the above, I should see
the combiner DTRT, but I don't know for sure whether something else in
the compiler will also need adjustments for these new MTE insns.
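
For illustration only, one possible shape for an stzg pattern, written in
the same style as the existing stg pattern (whether combine would actually
widen the 4-byte zero store above into the 16-byte granule store this
describes is part of the open question):

  (define_insn "*stzg"
    [(set (mem:QI (unspec:DI
            [(plus:DI (match_operand:DI 1 "register_operand" "rk")
                      (const_int 0))]
            UNSPEC_TAG_SPACE))
          (and:QI (lshiftrt:DI (match_operand:DI 0 "register_operand" "rk")
                               (const_int 56)) (const_int 15)))
     (set (mem:TI (match_dup 1)) (const_int 0))]
    "TARGET_MEMTAG"
    "stzg\\t%0, [%1]"
    [(set_attr "type" "memtag")]
  )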
I think in practice stz2g will need a separate pattern from st2g,
rather than being an alternative of the same pattern.  (That's because
the suggested pattern for st2g uses a (match_dup 0), which isn't subject
to constraint matching.)
Thanks,
Richard
gcc/ChangeLog:

	* config/aarch64/aarch64.md (*st2g_post): New pattern.
---
[New in RFC V2]
---
gcc/config/aarch64/aarch64.md | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index d3223e275c51..175aed3146ac 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -8495,6 +8495,26 @@
[(set_attr "type" "memtag")]
)
+;; ST2G with post-index writeback.
+(define_insn "*st2g_post"
+  [(set (mem:QI (unspec:DI
+          [(plus:DI (match_operand:DI 1 "register_operand" "=&rk")
+                    (const_int 0))]
+          UNSPEC_TAG_SPACE))
+        (and:QI (lshiftrt:DI (match_operand:DI 0 "register_operand" "rk")
+                             (const_int 56)) (const_int 15)))
+   (set (mem:QI (unspec:DI
+          [(plus:DI (match_dup 1) (const_int -16))]
+          UNSPEC_TAG_SPACE))
+        (and:QI (lshiftrt:DI (match_dup 0)
+                             (const_int 56)) (const_int 15)))
+   (set (match_dup 1)
+        (plus:DI (match_dup 1) (match_operand:DI 2 "aarch64_granule16_simm9" "i")))]
+  "TARGET_MEMTAG"
+  "st2g\\t%0, [%1], #%2"
+  [(set_attr "type" "memtag")]
+)
+
;; Load/Store 64-bit (LS64) instructions.
(define_insn "ld64b"
[(set (match_operand:V8DI 0 "register_operand" "=r")