https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98350
--- Comment #4 from Di Zhao <dizhao at os dot amperecomputing.com> ---
I've found the same problem with gcc-12 and gcc-13 (trunk). By improving the
workaround in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84114, more FMAs can
be inserted for vector mode. For the testcase in this tracker, 6 "fmla"
instructions can be generated with attachment 54735.

The compile options I used are "-Ofast -mcpu=neoverse-n1".
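
For readers unfamiliar with the pattern under discussion, here is a minimal sketch of a loop whose body is a sum of products, which is the kind of expression where FMAs might be formed after vectorization. This is a hypothetical illustration, not the testcase attached to this tracker; the function name and signature are my own assumptions.

```c
#include <stddef.h>

/* Hypothetical kernel, NOT the testcase from this PR.
   Each iteration adds two products to an accumulator, so the
   additions can in principle be fused with the multiplications
   (fmla on AArch64), depending on how the compiler orders the
   additions before FMA formation. */
void sum_of_products(double *restrict out,
                     const double *restrict a, const double *restrict b,
                     const double *restrict c, const double *restrict d,
                     size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] * b[i] + c[i] * d[i] + out[i];
}
```

Compiling such a file on an AArch64 target with "-Ofast -mcpu=neoverse-n1 -S" and counting the "fmla" instructions in the assembly output is one way to compare compiler versions; the count for this sketch may differ from the numbers reported above.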