https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120624

            Bug ID: 120624
           Summary: aarch64: Incorrect DCE of a ZA restore in SME code
           Product: gcc
           Version: 16.0
            Status: UNCONFIRMED
          Keywords: aarch64-sme, wrong-code
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rsandifo at gcc dot gnu.org
  Target Milestone: ---
            Target: aarch64*-*-*

For:

#include <arm_sme.h>

void callee();

__arm_new("za") __arm_locally_streaming int test()
{
  svbool_t all = svptrue_b8();
  svint8_t expected = svindex_s8(1, 1);
  svwrite_hor_za8_m(0, 0, all, expected);

  callee();

  svint8_t actual = svread_hor_za8_m(svdup_s8(0), all, 0, 0);
  return svptest_any(all, svcmpne(all, expected, actual));
}

compiled with -O2 -march=armv9-a+sme, GCC produces:

        bl      callee()
        smstart sm
        msr     tpidr2_el0, xzr
        ptrue   p7.b, all
        mov     w12, 0
        index   z31.b, #1, #1
        mov     z30.b, #0
        mova    z30.b, p7/m, za0h.b[w12, 0]

This is missing the necessary conditional restore of ZA after the call to
callee.
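
For readers unfamiliar with the lazy-save scheme, the following is a rough
plain-C model of what the conditional restore is meant to achieve.  It is only
a sketch, not SME code and not what GCC emits: the names za, tpidr2,
callee_model and test_model are hypothetical, ZA is shrunk to 16 bytes, and the
callee is assumed to commit the lazy save and then clobber ZA.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { ZA_BYTES = 16 };        /* hypothetical, tiny stand-in for ZA */

static uint8_t za[ZA_BYTES];   /* models the contents of ZA */
static uint8_t *tpidr2;        /* models TPIDR2_EL0: save buffer or NULL */

/* Models a callee that needs ZA for itself (assumption for this sketch):
   it commits any pending lazy save, then overwrites ZA.  */
static void callee_model(void)
{
  if (tpidr2)
    {
      memcpy(tpidr2, za, ZA_BYTES);   /* commit the caller's lazy save */
      tpidr2 = NULL;                  /* tell the caller it was committed */
    }
  memset(za, 0xff, ZA_BYTES);         /* clobber ZA */
}

/* Models test() from the report.  Returns nonzero if ZA differs from the
   expected contents after the call.  DO_RESTORE selects whether the
   conditional restore is present (1) or has been deleted (0).  */
static int test_model(int do_restore)
{
  uint8_t expected[ZA_BYTES], save_buffer[ZA_BYTES];

  for (size_t i = 0; i < ZA_BYTES; i++)
    expected[i] = (uint8_t) (i + 1);
  memcpy(za, expected, ZA_BYTES);     /* write ZA before the call */

  tpidr2 = save_buffer;               /* set up the lazy save */
  callee_model();
  if (do_restore && tpidr2 == NULL)   /* conditional restore */
    memcpy(za, save_buffer, ZA_BYTES);
  tpidr2 = NULL;                      /* cancel the lazy save */

  return memcmp(za, expected, ZA_BYTES) != 0;
}
```

In this model, test_model(1) corresponds to the intended behaviour and
test_model(0) to the code shown above, where only the final clearing of
TPIDR2_EL0 survives.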

Mode switching, which is responsible for inserting the saves and restores, does
insert a restore into the correct place.  However, the restore pattern is
missing an important bit of dataflow information, which means that it can be
incorrectly deleted as dead.

Unfortunately, all the existing tests for this functionality used calls and
asms to model dataflow, and those don't seem to be affected, so the problem
went unnoticed.
