Indu Bhagat <indu.bha...@oracle.com> writes:
> Store Allocation Tags (st2g) is an Armv8.5-A memory tagging (MTE)
> instruction. It stores an allocation tag to two tag granules of memory.
>
> TBD:
>   - Not too sure what is the best way to generate the st2g yet; A
>     subsequent patch will emit them in one of the target hooks.

Regarding the previous thread about this:

    https://gcc.gnu.org/pipermail/gcc-patches/2024-November/668671.html

and your question about whether all types of store tag instructions
should be volatile: if we went for that approach, then yeah, I think so.

As I mentioned there, I don't think we should use (unspec ...) memory
addresses.

But thinking more about it: can we guarantee that GCC will only use
these instruction patterns with base registers that are aligned to
16 bytes?  If so, then perhaps an alternative would be to model
them as read-modify-write operations to the whole granule (even though
the actual instructions leave normal memory untouched and only change
the tags).  That is, rather than:

>
> gcc/ChangeLog:
>
>       * config/aarch64/aarch64.md (st2g): New definition.
> ---
>  gcc/config/aarch64/aarch64.md | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
>
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index 0c7aebb838cd..d3223e275c51 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -8475,6 +8475,26 @@
>    [(set_attr "type" "memtag")]
>  )
>  
> +;; ST2G updates allocation tags for two memory granules (i.e. 32 bytes) at
> +;; once, without zero initialization.
> +(define_insn "st2g"
> +  [(set (mem:QI (unspec:DI
> +      [(plus:DI (match_operand:DI 1 "register_operand" "rk")
> +                (match_operand:DI 2 "aarch64_granule16_simm9" "i"))]
> +      UNSPEC_TAG_SPACE))
> +     (and:QI (lshiftrt:DI (match_operand:DI 0 "register_operand" "rk")
> +                          (const_int 56)) (const_int 15)))
> +   (set (mem:QI (unspec:DI
> +      [(plus:DI (match_dup 1)
> +                (match_operand:DI 3 "aarch64_granule16_simm9" "i"))]
> +      UNSPEC_TAG_SPACE))
> +     (and:QI (lshiftrt:DI (match_dup 0)
> +                          (const_int 56)) (const_int 15)))]
> +  "TARGET_MEMTAG && (INTVAL (operands[2]) - 16 == INTVAL (operands[3]))"
> +  "st2g\\t%0, [%1, #%2]"
> +  [(set_attr "type" "memtag")]
> +)
> +

...this, we could do:

(set (match_operand:OI 0 "aarch64_granule_memory_operand" "+<new constraint>")
     (unspec_volatile:OI
       [(match_dup 0)
        (match_operand:DI 1 "register_operand" "rk")]
       UNSPECV...))

Using OImode (256 bits, i.e. 32 bytes) indicates that two full granules
are affected by the store, but that no other memory is affected.  The
(match_dup 0) read indicates that this store does not kill any previous
store to the same 32 bytes (since the contents of normal memory don't change).
The unspec_volatile should ensure that nothing tries to remove the
store as dead (which would especially be a problem when clearing tags).
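
To make the size arithmetic concrete, here is a minimal C sketch.  It is
not GCC code and every name in it is made up for illustration.  It assumes
the 16-byte granule size mentioned above and takes the tag from bits 59:56
of the register, mirroring the (and (lshiftrt ... 56) 15) in the quoted
pattern:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names, for illustration only. */
#define MTE_GRANULE_BYTES 16u                     /* one tag granule */
#define ST2G_SPAN_BYTES   (2u * MTE_GRANULE_BYTES) /* st2g covers two granules */

/* Two granules are 32 bytes, i.e. 256 bits, which is the size of OImode. */
static_assert(ST2G_SPAN_BYTES * 8u == 256u, "st2g span should match OImode");

/* True if ADDR is aligned to a tag granule (16 bytes). */
static bool granule_aligned(uintptr_t addr)
{
    return (addr & (MTE_GRANULE_BYTES - 1u)) == 0;
}

/* Extract the 4-bit allocation tag from bits 59:56 of REG. */
static unsigned alloc_tag(uint64_t reg)
{
    return (unsigned) ((reg >> 56) & 15u);
}
```

If the base address passes granule_aligned, the 32 bytes starting there
are exactly the two granules whose tags the instruction changes, which is
what makes the whole-granule read-modify-write model a safe description.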

Using a single memory operand for the whole instruction has the advantage
of only requiring one offset to be represented, rather than having both
operands 2 and 3 in the original pattern.  It also copes more easily
with cases where the offset is zero for the first or second address,
since no (plus ...) should be present in that case.
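
For reference, the single offset would still need to satisfy whatever
aarch64_granule16_simm9 checks.  As a rough sketch, and assuming that
predicate means "signed 9-bit immediate scaled by 16" (so a multiple of
16 in the range [-4096, 4080]), the check would look something like the
following.  The function name is hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if OFFSET is a multiple of 16 that fits in
   a signed 9-bit field after scaling by 16, i.e. lies in [-4096, 4080].
   The range is an assumption based on the predicate's name.  */
static bool granule16_simm9_ok(int64_t offset)
{
    return offset >= -4096 && offset <= 4080 && (offset % 16) == 0;
}
```

With a single memory operand, this is the only offset check needed, and
a zero offset is simply the case where the address is the bare base
register.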

Thanks,
Richard
