https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118055

--- Comment #4 from Hans-Peter Nilsson <hp at gcc dot gnu.org> ---
(In reply to Hongtao Liu from comment #3)
> > 
> > Is it perhaps that the test is brittle; mostly target-specific despite being
> > at the tree-level and that instead the scan-test should be a specific
> > known-matching target list?
> 
> The testcase is used to detect load/store motion optimization which relies
> on loop unrolling; my commit adjusted the unroll heuristic to prevent some
> "bad" (performance-wise) unrolling, which breaks the testcase on some targets.
> 
> For this testcase itself, (for some targets) it may be necessary to add
> --param max-completely-peeled-insns=300 to ensure that unroll occurs.
> 
> From a performance perspective, the targets may need to fine-tune the
> max-completely-peeled-insns parameter according to benchmarks.

I'll take that as a "yes" to my question. :)

And "yes" seems correct; I made a quick analysis myself: the "number of insns"
compared here quickly boils down to testing RTL-level target specifics, like
MOVE_MAX.  Note that the test involves moving a lot of 64-bit entities.

So, this is mostly a difference between "32-bit" and "64-bit" targets, and the
target specifier should probably reflect that rather than being an accumulated
list of targets.  There are exceptions to the mapping MOVE_MAX=4 => "32-bit",
MOVE_MAX=8 => "64-bit": for example pru-elf (which I noticed did *not* regress
in the posted results) is "32-bit" but has #define MOVE_MAX 8.
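
To illustrate the arithmetic, here is a rough sketch in C.  It assumes, purely
for illustration, that the insn estimate grows with the number of moves needed
per 64-bit entity; the real size estimate in GCC is more involved.  The helper
names moves_needed and total_moves are hypothetical and are not GCC functions.

```c
/* Hypothetical helper, not part of GCC: number of move insns needed to
   copy SIZE bytes when one move handles at most MOVE_MAX bytes.  */
static unsigned
moves_needed (unsigned size, unsigned move_max)
{
  /* Round up: a partial chunk still costs one move.  */
  return (size + move_max - 1) / move_max;
}

/* Hypothetical helper: total moves for COUNT entities of SIZE bytes each.
   COUNT is an illustrative input, not a number taken from the testcase.  */
static unsigned
total_moves (unsigned count, unsigned size, unsigned move_max)
{
  return count * moves_needed (size, move_max);
}
```

Under that assumption, a target with MOVE_MAX 4 needs twice as many moves per
64-bit entity as one with MOVE_MAX 8, so the same loop body can land on
different sides of a fixed max-completely-peeled-insns limit.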
