On Thu, Sep 24, 2015 at 11:56:22AM +0200, Bernd Schmidt wrote:
> On 09/24/2015 12:06 AM, Segher Boessenkool wrote:
> >The current basic block reordering always uses the "software trace cache"
> >algorithm.  That has a few problems:
> >
> >1) It increases code size substantially; this makes it not suitable for
> >-O1 or -Os, and not at all for some architectures;
> >2) but it is enabled for -Os and all targets;
> >3) and -O1 gets nothing, resulting in pretty jumpy code.
>
> A general question first, I see code in bb-reorder.c (in copy_bb_p) that
> limits the amount of code growth if not optimizing for speed. Is that
> not working as expected or not sufficient?

It works.  The "simple" algorithm generates slightly smaller code though
(less than a percent).  Defaulting -Os to STC is easy of course; do you
prefer that?

> Your code looks like a nice clean algorithm so I have no objections to
> it (detailed comments to follow), but I want to make sure it is
> necessary to add it.

It's not just for -Os, but also for -O1 (where we currently don't reorder
at all, although various passes leave the CFG in a pretty sorry state --
like, we run shrink-wrapping at -O1, and it can make quite a mess if some
blocks are copied and others not; but this is just an example, it was the
trigger for me though).

And, when I wrote the original for this, it was for a target where STC
does not help at all (there is no instruction cache); "simple" saves a
lot of space at -O2.  Quite important for embedded targets.

Finally, it lets us easily plug in other algorithms.


Segher