https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120624
Bug ID: 120624
Summary: aarch64: Incorrect DCE of a ZA restore in SME code
Product: gcc
Version: 16.0
Status: UNCONFIRMED
Keywords: aarch64-sme, wrong-code
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: rsandifo at gcc dot gnu.org
Target Milestone: ---
Target: aarch64*-*-*

For:

  #include <arm_sme.h>

  void callee();

  __arm_new("za") __arm_locally_streaming int test()
  {
    svbool_t all = svptrue_b8();
    svint8_t expected = svindex_s8(1, 1);
    svwrite_hor_za8_m(0, 0, all, expected);

    callee();

    svint8_t actual = svread_hor_za8_m(svdup_s8(0), all, 0, 0);
    return svptest_any(all, svcmpne(all, expected, actual));
  }

compiled with -O2 -march=armv9-a+sme, GCC produces:

        bl      callee()
        smstart sm
        msr     tpidr2_el0, xzr
        ptrue   p7.b, all
        mov     w12, 0
        index   z31.b, #1, #1
        mov     z30.b, #0
        mova    z30.b, p7/m, za0h.b[w12, 0]

This is missing the necessary conditional restore of ZA after the call to
callee.

Mode switching, which is responsible for inserting the saves and restores,
does insert a restore into the correct place. However, the restore pattern
is missing an important bit of dataflow information, which means that it
can be incorrectly deleted as dead.

Unfortunately, all the tests for this functionality used calls and asms
to model dataflow, and those don't seem to be affected.
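
As a rough illustration of the failure mode described above, here is a toy model (not GCC code) of how a backward dead-code pass can delete an instruction whose pattern under-describes what it writes. The report does not say which piece of dataflow information is missing, so the sketch simply assumes that the restore fails to advertise a definition of ZA. All names here (toy_insn, toy_dce, restore_survives, the RES_* bits) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy resources; illustrative only, not GCC register numbers. */
enum { RES_ZA = 1u << 0, RES_Z30 = 1u << 1, RES_RESULT = 1u << 2 };

struct toy_insn {
    const char *name;
    unsigned defs;      /* resources the insn claims to write */
    unsigned uses;      /* resources the insn claims to read */
    bool side_effects;  /* never deleted (for example, a call) */
    bool deleted;
};

/* Backward pass over straight-line code: an insn without side effects
   is deleted when none of the resources it defines is live after it. */
static size_t toy_dce(struct toy_insn *insns, size_t n, unsigned live_out)
{
    assert(insns != NULL || n == 0);
    unsigned live = live_out;
    size_t removed = 0;
    for (size_t i = n; i-- > 0;) {
        struct toy_insn *insn = &insns[i];
        if (!insn->side_effects && (insn->defs & live) == 0) {
            insn->deleted = true;
            removed++;
            continue;
        }
        live &= ~insn->defs;   /* killed by this insn */
        live |= insn->uses;    /* needed by this insn */
    }
    return removed;
}

/* Models "call; restore ZA; read ZA; compare" and reports whether the
   restore survives when it advertises the given set of definitions. */
static bool restore_survives(unsigned restore_defs)
{
    struct toy_insn insns[] = {
        { "bl callee",      RES_ZA,       0,       true,  false },
        { "restore za",     restore_defs, 0,       false, false },
        { "mova z30, za0h", RES_Z30,      RES_ZA,  false, false },
        { "compare",        RES_RESULT,   RES_Z30, false, false },
    };
    toy_dce(insns, sizeof insns / sizeof insns[0], RES_RESULT);
    return !insns[1].deleted;
}
```

In this model, restore_survives(RES_ZA) keeps the restore because the later read of ZA makes the definition live, while restore_survives(0) deletes it. An opaque call or asm placed after the restore would be kept regardless through the side_effects flag, which is only loosely analogous to the tests mentioned above.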