On Thu, Sep 24, 2015 at 11:56:22AM +0200, Bernd Schmidt wrote:
> On 09/24/2015 12:06 AM, Segher Boessenkool wrote:
> >The current basic block reordering always uses the "software trace cache"
> >algorithm.  That has a few problems:
> >
> >1) It increases code size substantially; this makes it not suitable for
> >-O1 or -Os, and not at all for some architectures;
> >2) but it is enabled for -Os and all targets;
> >3) and -O1 gets nothing, resulting in pretty jumpy code.
> 
> A general question first, I see code in bb-reorder.c (in copy_bb_p) that 
> limits the amount of code growth if not optimizing for speed. Is that 
> not working as expected or not sufficient?

It works.  The "simple" algorithm generates slightly smaller code though
(less than a percent).  Defaulting -Os to STC is easy of course; do you
prefer that?

> Your code looks like a nice clean algorithm so I have no objections to 
> it (detailed comments to follow), but I want to make sure it is 
> necessary to add it.

It's not just for -Os, but also for -O1, where we currently don't reorder
at all, although various passes leave the CFG in a pretty sorry state.
For example, we run shrink-wrapping at -O1, and it can make quite a mess
if some blocks are copied and others are not.  That is just one example,
but it was the trigger for me.

And, when I wrote the original for this, it was for a target where STC
does not help at all (there is no instruction cache); "simple" saves a
lot of space at -O2.  Quite important for embedded targets.

Finally, it lets us easily plug in other algorithms.


Segher
