Indu Bhagat <indu.bha...@oracle.com> writes:
> Store Allocation Tags (st2g) is an Armv8.5-A memory tagging (MTE)
> instruction. It stores an allocation tag to two tag granules of memory.
>
> TBD:
> - Not too sure what is the best way to generate the st2g yet; A
>   subsequent patch will emit them in one of the target hooks.

Regarding the previous thread about this:

    https://gcc.gnu.org/pipermail/gcc-patches/2024-November/668671.html

and your question about whether all types of store tag instructions
should be volatile: if we went for that approach, then yeah, I think so.

As I mentioned there, I don't think we should use (unspec ...) memory
addresses.

But thinking more about it: can we guarantee that GCC will only use
these instruction patterns with base registers that are aligned to
16 bytes?  If so, then perhaps an alternative would be to model
them as read-modify-write operations to the whole granule (even though
the actual instructions leave normal memory untouched and only change
the tags).  That is, rather than:

>
> gcc/ChangeLog:
>
>     * config/aarch64/aarch64.md (st2g): New definition.
> ---
>  gcc/config/aarch64/aarch64.md | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
>
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index 0c7aebb838cd..d3223e275c51 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -8475,6 +8475,26 @@
>     [(set_attr "type" "memtag")]
>  )
>
> +;; ST2G updates allocation tags for two memory granules (i.e. 32 bytes) at
> +;; once, without zero initialization.
> +(define_insn "st2g"
> +  [(set (mem:QI (unspec:DI
> +         [(plus:DI (match_operand:DI 1 "register_operand" "rk")
> +                   (match_operand:DI 2 "aarch64_granule16_simm9" "i"))]
> +         UNSPEC_TAG_SPACE))
> +        (and:QI (lshiftrt:DI (match_operand:DI 0 "register_operand" "rk")
> +                             (const_int 56)) (const_int 15)))
> +   (set (mem:QI (unspec:DI
> +         [(plus:DI (match_dup 1)
> +                   (match_operand:DI 3 "aarch64_granule16_simm9" "i"))]
> +         UNSPEC_TAG_SPACE))
> +        (and:QI (lshiftrt:DI (match_dup 0)
> +                             (const_int 56)) (const_int 15)))]
> +  "TARGET_MEMTAG && (INTVAL (operands[2]) - 16 == INTVAL (operands[3]))"
> +  "st2g\\t%0, [%1, #%2]"
> +  [(set_attr "type" "memtag")]
> +)
> +

...this, we could do:

  (set (match_operand:OI 0 "aarch64_granule_memory_operand" "+<new constraint>")
       (unspec_volatile:OI
         [(match_dup 0)
          (match_operand:DI 1 "register_operand" "rk")]
         UNSPECV...))

Using OImode (256 bits, i.e. 32 bytes) indicates that two full granules
are affected by the store, but that no other memory is affected.  The
(match_dup 0) read indicates that this store does not kill any previous
store to the same 32 bytes (since the contents of normal memory don't
change).  The unspec_volatile should ensure that nothing tries to remove
the store as dead (which would especially be a problem when clearing
tags).

Using a single memory operand for the whole instruction has the
advantage of only requiring one offset to be represented, rather than
having both operands 2 and 3 in the original pattern.  It also copes
more easily with cases where the offset is zero for the first or second
address, since no (plus ...) should be present in that case.

Thanks,
Richard
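
P.S. To make the arithmetic concrete, here is a minimal C sketch of the
quantities discussed above.  It is not GCC code.  The names logical_tag,
granule16_simm9_p, st2g_footprint and struct footprint are made up for
illustration; the offset check assumes that aarch64_granule16_simm9
accepts multiples of 16 in the range [-4096, 4080]; and the alignment
test simply encodes the "base registers aligned to 16 bytes" premise
rather than describing what the hardware does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GRANULE_SIZE 16 /* bytes of memory covered by one allocation tag */

/* Tag value taken from the source register: (x >> 56) & 15, the same
   expression that appears in the quoted st2g pattern.  */
unsigned int
logical_tag (uint64_t reg)
{
  return (unsigned int) ((reg >> 56) & 0xf);
}

/* Hypothetical stand-in for the aarch64_granule16_simm9 predicate.
   Assumption: a signed 9-bit immediate scaled by 16.  */
bool
granule16_simm9_p (int64_t offset)
{
  return offset >= -4096 && offset <= 4080 && (offset & 0xf) == 0;
}

/* The block of memory that a single OImode operand would describe.  */
struct footprint
{
  uint64_t start; /* first byte of the first granule */
  uint64_t size;  /* two granules: 32 bytes, i.e. 256 bits */
};

/* Compute the footprint of "st2g Xt, [base, #offset]" under the premise
   that the resulting address is granule-aligned.  Returns false when
   the offset is not encodable or the premise does not hold.  */
bool
st2g_footprint (uint64_t base, int64_t offset, struct footprint *out)
{
  /* Drop the top byte, which carries the tag and other ignored bits.  */
  uint64_t addr = (base & 0x00ffffffffffffffULL) + (uint64_t) offset;

  if (!granule16_simm9_p (offset) || addr % GRANULE_SIZE != 0)
    return false;

  out->start = addr;
  out->size = 2 * GRANULE_SIZE;
  return true;
}
```

For example, a base of 0x0a00000000001000 with offset 32 gives tag 0xa
and a footprint of 32 bytes starting at 0x1020, which is one contiguous
block and so needs only one offset to describe it.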