On 22 Mar 2010, at 17:44, Ian Bolton wrote:

>> Enabling BB-reorder only if profile info is available, is not the
>> right way to go. The compiler really doesn't place blocks in sane
>> places without it -- and it shouldn't have to, either. For example if
>> you split an edge at some point, the last thing you want to worry
>> about, is where the new basic block is going to end up.
>>
>> There are actually a few bugs in Bugzilla about BB-reorder, FWIW.
>
> I've done a few searches in Bugzilla and am not sure if I have found
> the BB reorder bugs you are referring to.
>
> The ones I have found are:
>
> 16797: Opportunity to remove unnecessary load instructions
> 41396: missed space optimization related to basic block reorder
> 21002: RTL prologue and basic-block reordering pessimizes delay-slot
> filling.
>
> (If you can recall any others, I'd appreciate hearing of them.)
>
> Based on 41396, it looks like BB reorder is disabled for -Os. But
> you said in your post above that "the compiler really doesn't place
> blocks in sane places without it", so does that mean that we could
> probably increase performance for -Os if BB reorder was (improved)
> and enabled for -Os?
Back with our old gcc 3.4 compiler we used to routinely compile our code
at -Os but with BB reordering enabled, as it gave us a significant
performance gain for a very small increase in code size (less than 2%
code size impact, from what I remember, versus about a 5% performance
win).

With gcc 4.4 (where we are until 4.5 is out) I've been constantly
frustrated by not being able to do BB reordering at -Os. Equally, though,
our code sizes at -O2 have steadily shrunk, so that -O2 is now only about
10% larger than -Os if we disable cache-line alignment of functions,
while -O2 performance is often 15% to 30% faster.

I seem to remember some suggestions in the past that we might want
something like a -Os2 that would generally optimize for size but would
still enable some number of small code size expansions where the
performance benefit was large (BB reordering would be my favourite such
case). That's the optimization setting I'd like to see us use for almost
everything at Ubicom.

Cheers,
Dave