https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120624

--- Comment #2 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The trunk branch has been updated by Richard Sandiford <rsand...@gcc.gnu.org>:

https://gcc.gnu.org/g:8546265e2ee386ea8a4b2f9150ddfed32c9d15ea

commit r16-1476-g8546265e2ee386ea8a4b2f9150ddfed32c9d15ea
Author: Richard Sandiford <richard.sandif...@arm.com>
Date:   Thu Jun 12 12:10:39 2025 +0100

    aarch64: Incorrect removal of ZA restore [PR120624]

    The PCS defines a lazy save scheme for managing ZA across normal
    "private-ZA" functions.  GCC currently uses this scheme for calls
    to all private-ZA functions (rather than using caller-save).

    Therefore, before a sequence of calls to private-ZA functions, GCC emits
    code to set up a lazy save.  After the sequence of calls, GCC emits code
    to check whether the lazy save was committed and restore the ZA contents
    if so.

    These sequences are emitted by the mode-switching pass, in an attempt
    to reduce the number of redundant saves and restores.

    The lazy save scheme also means that, before a function can use ZA,
    it must first conditionally store the old contents of ZA to the caller's
    lazy save buffer, if any.
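
    As a rough illustration of the scheme described above, here is a toy
    model in C.  It is a sketch only: the names (za, tpidr2, setup_lazy_save,
    commit_lazy_save, restore_if_committed, call_sequence) are hypothetical,
    ZA is shrunk to 16 bytes, and the TPIDR2 block is collapsed into a bare
    buffer pointer.  It is not the code that GCC emits, and it does not
    model PSTATE.ZA.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define ZA_BYTES 16  /* toy size; the real ZA size depends on the vector length */

/* Toy stand-ins for architectural state (hypothetical names). */
static unsigned char za[ZA_BYTES];  /* current contents of ZA */
static unsigned char *tpidr2;       /* models TPIDR2_EL0: pending save buffer or NULL */

/* Caller: advertise a save buffer instead of saving ZA eagerly. */
static void setup_lazy_save(unsigned char *buf)
{
  assert(buf != NULL);
  tpidr2 = buf;
}

/* Callee: must run before the callee uses ZA for its own purposes. */
static void commit_lazy_save(void)
{
  if (tpidr2 != NULL) {
    memcpy(tpidr2, za, ZA_BYTES);  /* store the caller's ZA contents */
    tpidr2 = NULL;                 /* NULL now means "the save was committed" */
  }
}

/* Caller: after the calls, restore only if some callee committed the save. */
static bool restore_if_committed(const unsigned char *buf)
{
  bool committed = (tpidr2 == NULL);
  if (committed)
    memcpy(za, buf, ZA_BYTES);
  tpidr2 = NULL;  /* cancel the lazy save either way */
  return committed;
}

/* A private-ZA callee that never touches ZA. */
static void callee_ignores_za(void) {}

/* A private-ZA callee that uses ZA internally and clobbers it. */
static void callee_uses_za(void)
{
  commit_lazy_save();
  memset(za, 0xff, ZA_BYTES);
}

/* Runs one call sequence; returns true if a restore was needed. */
static bool call_sequence(void (*callee)(void))
{
  unsigned char buf[ZA_BYTES];
  setup_lazy_save(buf);
  callee();
  return restore_if_committed(buf);
}
```

    In this model, call_sequence (callee_ignores_za) needs no restore and
    never copies ZA, while call_sequence (callee_uses_za) commits the save
    and restores the caller's contents afterwards.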

    This all creates some relatively complex dependencies between
    setup code, save/restore code, and normal reads from and writes to ZA.
    These dependencies are modelled using special fake hard registers:

        ;; Sometimes we use placeholder instructions to mark where later
        ;; ABI-related lowering is needed.  These placeholders read and
        ;; write this register.  Instructions that depend on the lowering
        ;; read the register.
        (LOWERING_REGNUM 87)

        ;; Represents the contents of the current function's TPIDR2 block,
        ;; in abstract form.
        (TPIDR2_BLOCK_REGNUM 88)

        ;; Holds the value that the current function wants PSTATE.ZA to be.
        ;; The actual value can sometimes vary, because it does not track
        ;; changes to PSTATE.ZA that happen during a lazy save and restore.
        ;; Those effects are instead tracked by ZA_SAVED_REGNUM.
        (SME_STATE_REGNUM 89)

        ;; Instructions write to this register if they set TPIDR2_EL0 to a
        ;; well-defined value.  Instructions read from the register if they
        ;; depend on the result of such writes.
        ;;
        ;; The register does not model the architected TPIDR2_EL0, just the
        ;; current function's management of it.
        (TPIDR2_SETUP_REGNUM 90)

        ;; Represents the property "has an incoming lazy save been committed?".
        (ZA_FREE_REGNUM 91)

        ;; Represents the property "are the current function's ZA contents
        ;; stored in the lazy save buffer, rather than in ZA itself?".
        (ZA_SAVED_REGNUM 92)

        ;; Represents the contents of the current function's ZA state in
        ;; abstract form.  At various times in the function, these contents
        ;; might be stored in ZA itself, or in the function's lazy save buffer.
        ;;
        ;; The contents persist even when the architected ZA is off.  Private-ZA
        ;; functions have no effect on its contents.
        (ZA_REGNUM 93)

    Every normal read from ZA and write to ZA depends on SME_STATE_REGNUM,
    in order to sequence the code with the initial setup of ZA and
    with the lazy save scheme.

    The code to restore ZA after a call involves several instructions,
    including conditional control flow.  It is initially represented as
    a single define_insn and is split late, after shrink-wrapping and
    prologue/epilogue insertion.

    The split form of the restore instruction includes a conditional call
    to __arm_tpidr2_restore:

    (define_insn "aarch64_tpidr2_restore"
      [(set (reg:DI ZA_SAVED_REGNUM)
            (unspec:DI [(reg:DI R0_REGNUM)] UNSPEC_TPIDR2_RESTORE))
       (set (reg:DI SME_STATE_REGNUM)
            (unspec:DI [(reg:DI SME_STATE_REGNUM)] UNSPEC_TPIDR2_RESTORE))
      ...
    )

    The write to SME_STATE_REGNUM indicates the end of the region where
    ZA_REGNUM might differ from the real contents of ZA.  In other words,
    it is the point at which normal reads from ZA and writes to ZA
    can safely take place.

    To finally get to the point, the problem in this PR was that the
    unsplit aarch64_restore_za pattern was missing this change to
    SME_STATE_REGNUM.  It could therefore be deleted as dead before
    it had a chance to be split.  The split form had the correct dataflow,
    but the unsplit form didn't.
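
    The effect of the missing set can be sketched with a toy backward
    liveness pass in C.  This is an assumption-laden model, not GCC's DCE:
    the struct insn type, the bit masks, and the toy_dce and
    restore_survives helpers are all hypothetical, and the operand sets are
    simplified.  An instruction is kept only if it is a root (for example
    a call or a real ZA access) or if it sets a register that a later kept
    instruction reads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Bit masks standing in for the fake hard registers (toy encoding). */
enum { ZA_SAVED = 1u << 0, SME_STATE = 1u << 1, ZA = 1u << 2 };

struct insn {
  const char *name;
  unsigned defs;   /* registers the instruction sets */
  unsigned uses;   /* registers the instruction reads */
  bool is_root;    /* kept unconditionally (has other effects) */
};

/* Walk backwards, keeping an insn only if it is a root or one of its
   sets is live.  Returns the number of insns kept.  */
static size_t toy_dce(const struct insn *insns, size_t n, bool *kept)
{
  unsigned live = 0;
  size_t count = 0;
  for (size_t i = n; i-- > 0;) {
    kept[i] = insns[i].is_root || (insns[i].defs & live) != 0;
    if (kept[i]) {
      live &= ~insns[i].defs;
      live |= insns[i].uses;
      count++;
    }
  }
  return count;
}

/* Does the unsplit restore survive when it sets RESTORE_DEFS?  */
static bool restore_survives(unsigned restore_defs)
{
  const struct insn insns[] = {
    { "call to private-ZA function", 0, 0, true },
    { "aarch64_restore_za (unsplit)", restore_defs, SME_STATE, false },
    { "normal read from ZA", 0, SME_STATE | ZA, true },
  };
  bool kept[3];
  toy_dce(insns, 3, kept);
  return kept[1];
}
```

    With only ZA_SAVED in its sets, nothing later reads what the restore
    writes, so restore_survives (ZA_SAVED) is false.  Adding SME_STATE, as
    the fix does, makes the following ZA read depend on the restore, so
    restore_survives (ZA_SAVED | SME_STATE) is true.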

    Unfortunately, the tests for this code tended to use calls and asms
    to model regions of ZA usage, and those don't seem to be affected
    in the same way.

    gcc/
            PR target/120624
            * config/aarch64/aarch64.md (SME_STATE_REGNUM): Expand on comments.
            * config/aarch64/aarch64-sme.md (aarch64_restore_za): Also set
            SME_STATE_REGNUM.

    gcc/testsuite/
            PR target/120624
            * gcc.target/aarch64/sme/za_state_7.c: New test.
