Code size for spec2000 is almost unchanged (many benchmarks produce identical binaries). For those that changed, we have the following numbers (200 vs. 100, both dynamically linked builds with -Ofast -funroll-loops -flto):

183.equake              +10%
164.gzip, 173.applu     +3.5%
187.facerec, 191.fma3d  +2.5%
200.sixtrack            +2%
177.mesa, 178.galgel    +1%
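For anyone who wants to reproduce the comparison, a minimal sketch of the build-and-measure step. The parameter name is my assumption (the limit under discussion is presumably GCC's --param max-completely-peeled-insns), and equake.c stands in for the actual SPEC sources:

# sketch: param name assumed, equake.c is a placeholder for the benchmark
gcc -Ofast -funroll-loops -flto --param max-completely-peeled-insns=100 \
    equake.c -lm -o equake.100
gcc -Ofast -funroll-loops -flto --param max-completely-peeled-insns=200 \
    equake.c -lm -o equake.200
size equake.100 equake.200   # compare the text columns of both binaries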
On Wed, Nov 12, 2014 at 2:51 AM, Jan Hubicka <hubi...@ucw.cz> wrote:
>> > 150 and 200 make Silvermont performance better on 173.applu (+8%) and
>> > 183.equake (+3%); Haswell spec2006 performance stays almost unchanged.
>> > A higher value of 300 leaves the performance of the mentioned tests
>> > unchanged but adds some regressions on other benchmarks.
>> >
>> > So I like 200 as well as 120 and 150, but can confirm performance
>> > gains only for x86.
>>
>> IMO it's either 150 or 200. We chose 200 for our 4.9-based compiler
>> because it gave the performance boost without affecting code size (on
>> x86-64) and because the previous value was 400, but it's your call.
>
> Both 150 and 200 work for me globally if there is not too much code-size
> bloat (I did not see code size mentioned here).
>
> What I did before decreasing the bounds was strengthen the loop iteration
> count bounds and add logic that predicts the constant propagation enabled
> by unrolling. For this reason 400 became too large, as we do a lot more
> complete unrolling than before. Also, 400 in older compilers is not
> really 400 in newer ones.
>
> Because I saw performance drop only with values below 50, I went for 100.
> It would be very interesting to actually analyze what happens for those
> two benchmarks (that should not be too hard with perf).
>
> Honza
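To make the constant propagation Honza describes concrete, here is a small self-contained illustration (my own example, not from the thread). Once the four-iteration loop is completely peeled, every array access has a constant index, so the propagator folds the whole function to a literal:

cat > cprop.c <<'EOF'
/* With the loop completely unrolled, each w[i] is a known constant,
   and constant propagation folds the sum 1+4+9+16 to 30. */
static const int w[4] = {1, 2, 3, 4};

int weighted_sum(void)
{
    int s = 0;
    for (int i = 0; i < 4; i++)
        s += w[i] * w[i];
    return s;
}
EOF
gcc -O2 -S cprop.c
grep '\$30' cprop.s   # on x86-64 the body reduces to "movl $30, %eax"

A larger limit lets this happen for bigger loop bodies, which is where the code-size deltas above come from. As for Honza's perf suggestion, running perf record on the regressing benchmark and then perf report should show which hot loops changed between the two parameter values.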