On 4/15/25 11:52 AM, Richard Sandiford wrote:
Indu Bhagat <indu.bha...@oracle.com> writes:
Using post-index st2g is a faster way of memory tagging/untagging, because a post-index 'st2g tag, [addr], #32' is equivalent to:
    stg tag, [addr, #0]
    stg tag, [addr, #16]
    add addr, addr, #32

TBD:
   - Currently generated in the aarch64 backend.  Not sure if this is
     the right way to do it.

If we do go for the "aarch64_granule_memory_operand" approach that
I described for patch 3, then that predicate (and the associated constraint)
could handle PRE_MODIFY and POST_MODIFY addresses, which would remove
the need for separate patterns.
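
For illustration only, a rough sketch of what such a predicate could look like (the name follows the suggestion above; the aarch64_granule16_simm9 check reuses the predicate from this series, and the overall shape is an assumption rather than something taken from patch 3):

(define_predicate "aarch64_granule_memory_operand"
  (match_code "mem")
{
  rtx addr = XEXP (op, 0);

  /* Plain register base.  */
  if (REG_P (addr))
    return true;

  /* Base plus a granule-aligned immediate offset.  */
  if (GET_CODE (addr) == PLUS
      && REG_P (XEXP (addr, 0))
      && CONST_INT_P (XEXP (addr, 1)))
    return aarch64_granule16_simm9 (XEXP (addr, 1), DImode);

  /* Writeback forms: the base is updated by a granule-aligned amount.  */
  if (GET_CODE (addr) == PRE_MODIFY || GET_CODE (addr) == POST_MODIFY)
    {
      rtx base = XEXP (addr, 0);
      rtx step = XEXP (addr, 1);
      return (REG_P (base)
              && GET_CODE (step) == PLUS
              && rtx_equal_p (XEXP (step, 0), base)
              && CONST_INT_P (XEXP (step, 1))
              && aarch64_granule16_simm9 (XEXP (step, 1), DImode));
    }

  return false;
})

The associated constraint would have to accept the same address forms so that the writeback addresses stay legitimate through reload.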


I think I understand :) I will try it out.  I guess one of the unknowns for me is whether the PRE_MODIFY / POST_MODIFY addresses will be generated as expected, even when the involved instructions have an unspec...

   - Also not clear how to weave in the generation of stz2g.

I think stz2g could be:

(set (match_operand:OI 0 "aarch64_granule_memory_operand" "+<new constraint>")
      (unspec_volatile:OI
        [(const_int 0)
         (match_operand:DI 1 "register_operand" "rk")]
        UNSPECV...))


The question I have is what changes will be necessary to have the compiler DTRT:

  i.e. for the zero-init case, instead of
        stg     x0, [x0, #0]
        str     wzr, [x0]
  generate
        stzg    x0, [x0]

  Similarly, for the value-init case, instead of
        stg     x0, [x0, #0]
        mov     w1, #42
        str     w1, [x0]
  generate
        mov     w1, #42
        stgp    x1, xzr, [x0]

I guess once I have worked out the patterns above, I should see the combiner DTRT, but I don't know for sure whether something else in the compiler will also need adjustments for these new MTE insns.

I think in practice stz2g will need a separate pattern from st2g,
rather than being an alternative of the same pattern.  (That's because
the suggested pattern for st2g uses a (match_dup 0), which isn't subject
to constraint matching.)
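
Putting that together with the snippet above, a separate stz2g pattern might look roughly like this; the memory constraint placeholder is carried over from the snippet, and the UNSPECV_STZ2G name is illustrative (it would need adding to the unspecv enum), not something taken from the series:

(define_insn "*stz2g"
  [(set (match_operand:OI 0 "aarch64_granule_memory_operand" "+<new constraint>")
        (unspec_volatile:OI
          [(const_int 0)
           (match_operand:DI 1 "register_operand" "rk")]
          UNSPECV_STZ2G))]
  "TARGET_MEMTAG"
  "stz2g\\t%1, %0"
  [(set_attr "type" "memtag")]
)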

Thanks,
Richard


ChangeLog:
        * gcc/config/aarch64/aarch64.md (*st2g_post): New pattern.

---
[New in RFC V2]
---
  gcc/config/aarch64/aarch64.md | 20 ++++++++++++++++++++
  1 file changed, 20 insertions(+)

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index d3223e275c51..175aed3146ac 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -8495,6 +8495,26 @@
    [(set_attr "type" "memtag")]
  )
+;; ST2G with post-index writeback.
+(define_insn "*st2g_post"
+  [(set (mem:QI (unspec:DI
+        [(plus:DI (match_operand:DI 1 "register_operand" "=&rk")
+                  (const_int 0))]
+        UNSPEC_TAG_SPACE))
+       (and:QI (lshiftrt:DI (match_operand:DI 0 "register_operand" "rk")
+                            (const_int 56)) (const_int 15)))
+   (set (mem:QI (unspec:DI
+        [(plus:DI (match_dup 1) (const_int -16))]
+        UNSPEC_TAG_SPACE))
+       (and:QI (lshiftrt:DI (match_dup 0)
+                            (const_int 56)) (const_int 15)))
+   (set (match_dup 1)
+        (plus:DI (match_dup 1) (match_operand:DI 2 "aarch64_granule16_simm9" "i")))]
+  "TARGET_MEMTAG"
+  "st2g\\t%0, [%1], #%2"
+  [(set_attr "type" "memtag")]
+)
+
  ;; Load/Store 64-bit (LS64) instructions.
  (define_insn "ld64b"
    [(set (match_operand:V8DI 0 "register_operand" "=r")
