https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118055
--- Comment #4 from Hans-Peter Nilsson <hp at gcc dot gnu.org> ---
(In reply to Hongtao Liu from comment #3)
> > Is it perhaps that the test is brittle; mostly target-specific despite being
> > at the tree-level and that instead the scan-test should be a specific
> > known-matching target list?
>
> The testcase is used to detect load/store motion optimization which relies
> on loop unrolling; my commit adjusted the unroll heuristic to prevent some
> "bad" (performance) unrolling and breaks the testcase on some targets.
>
> For this testcase itself, (for some targets) it may be necessary to add
> --param max-completely-peeled-insns=300 to ensure that unrolling occurs.
>
> From a performance perspective, the targets may need to fine-tune the
> max-completely-peeled-insns parameter according to benchmarks.

I'll take that as a "yes" to my question. :)

And "yes" seems correct; I made a quick analysis myself: the "number of insns"
compared here quickly boils down to testing RTL-level target specifics, like
MOVE_MAX. Note that the test involves moving a lot of 64-bit entities. So,
this is mostly a difference between "32-bit" and "64-bit" targets, and the
target specifier should probably reflect this rather than being an accumulated
list of targets.

There are exceptions to the mapping MOVE_MAX=4 => "32-bit", MOVE_MAX=8 =>
"64-bit", like pru-elf (which I noticed did *not* regress in the posted
results), which is "32-bit" but has #define MOVE_MAX 8.
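
To illustrate the MOVE_MAX point, here is a minimal sketch in C of how the move count for an object scales with the widest single move a target supports. It is a simplified model under stated assumptions, not GCC's actual cost code: the names moves_for_object and moves_for_objects are hypothetical, and the real "number of insns" estimate takes more than this into account.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not a GCC function: estimate how many move
   instructions it takes to copy an object of SIZE bytes when the widest
   single move is MOVE_MAX bytes, rounding up.  */
size_t
moves_for_object (size_t size, size_t move_max)
{
  assert (move_max > 0);
  return (size + move_max - 1) / move_max;
}

/* Hypothetical helper: total moves for COUNT objects of SIZE bytes each
   under the same simplified model.  */
size_t
moves_for_objects (size_t count, size_t size, size_t move_max)
{
  return count * moves_for_object (size, move_max);
}
```

Under this model a 64-bit (8-byte) entity costs two moves with MOVE_MAX 4 and one move with MOVE_MAX 8, so the same loop body would count roughly twice as many insns on a typical "32-bit" target. That would also be consistent with pru-elf, which has MOVE_MAX 8, not regressing.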