On Fri, Apr 27, 2012 at 12:07 AM, Igor Zamyatin <izamya...@gmail.com> wrote:
> Are you sure that tree-level unrollers are turned on at O2? My
> impression was that they work only at O3 or with f[unroll,peel]-loops
> flags.

Yes, they are on, but they only have an effect on tiny loops with very
small trip counts. With -O3, or with -funroll-loops/-fpeel-loops, the
code size is allowed to grow.

David

>
> On Tue, Apr 24, 2012 at 6:13 PM, Andi Kleen <a...@firstfloor.org> wrote:
>> tejohn...@google.com (Teresa Johnson) writes:
>>
>>> This patch adds heuristics to limit unrolling in loops with branches
>>> that may increase branch mispredictions. It affects loops that are
>>> not frequently iterated, and that are nested within a hot region of code
>>> that already contains many branch instructions.
>>>
>>> Performance tested with both internal benchmarks and with SPEC
>>> 2000/2006 on a variety of Intel systems (Core2, Corei7, SandyBridge) and a
>>> couple of different AMD Opteron systems.
>>> This improves performance of an internal search indexing benchmark by
>>> close to 2% on all the tested Intel platforms. It also consistently
>>> improves 445.gobmk (with FDO feedback where unrolling kicks in) by
>>> close to 1% on AMD Opteron. Other performance effects are neutral.
>>>
>>> Bootstrapped and tested on x86_64-unknown-linux-gnu. Is this ok for trunk?
>>
>> One problem with any unrolling heuristics is currently that gcc has
>> both the tree level and the rtl level unroller. The tree one is even
>> on at -O3. So if you tweak anything for one you have to affect both,
>> otherwise the other may still do the wrong thing(tm).
>
> Tree level unrollers (cunrolli and cunroll) do complete unroll. At O2,
> both of them are turned on, but gcc does not allow any code growth --
> which makes them pretty useless at O2 (very few loops qualify). The
> default max complete peel iteration is also too low compared with both
> icc and llvm. This needs to be tuned.
>
> David
>
>>
>> For some other tweaks I looked into a shared cost model some time ago.
>> May be still needed.
>>
>> -Andi
>>
>> --
>> a...@linux.intel.com -- Speaking for myself only
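
To illustrate what "complete unroll" of a tiny loop with a very small trip count means in the discussion above, here is a minimal C sketch. The function names and the trip count of 4 are hypothetical and not taken from the patch or from GCC. Whether cunrolli/cunroll actually transform this loop at a given optimization level depends on the GCC version and its parameters, which is not verified here.

```c
#include <stddef.h>

/* Hypothetical example: the trip count is a small compile-time constant. */
enum { N = 4 };

/* Rolled form: one compare-and-branch per iteration. */
int sum_rolled(const int *v)
{
    int sum = 0;
    for (size_t i = 0; i < N; i++)
        sum += v[i];
    return sum;
}

/* Hand-written equivalent of completely unrolling the loop above:
 * the loop control and its branch are gone. */
int sum_unrolled(const int *v)
{
    return v[0] + v[1] + v[2] + v[3];
}
```

Comparing the assembly produced by `gcc -O2 -S` and `gcc -O3 -S` for `sum_rolled` is one way to check what a particular GCC version does with such a loop.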