On Wed, Sep 13, 2017 at 3:21 AM, Michael Clark <michaeljcl...@mac.com> wrote:
>
>> On 13 Sep 2017, at 1:15 PM, Michael Clark <michaeljcl...@mac.com> wrote:
>>
>> - https://rv8.io/bench#optimisation
>> - https://rv8.io/bench#executable-file-sizes
>>
>> -O2 is 98% perf of -O3 on x86-64
>> -Os is 81% perf of -O3 on x86-64
>>
>> -O2 saves 5% space on -O3 on x86-64
>> -Os saves 8% space on -Os on x86-64
>>
>> 17% drop in performance for 3% saving in space is not a good trade for a
>> “general” size optimisation. It’s more like executable compression.
>
> Sorry fixed typo:
>
> -O2 is 98% perf of -O3 on x86-64
> -Os is 81% perf of -O3 on x86-64
>
> -O2 saves 5% space on -O3 on x86-64
> -Os saves 8% space on -O3 on x86-64
>
> The extra ~3% space saving for ~17% drop in performance doesn’t seem like a
> good general option for size based on the cost in performance.
>
> Again. I really like GCC’s -O2 and hope that its binaries don’t grow in size
> nor slow down.
I think with GCC, -Os and -O2 are essentially the same, with the difference
that -Os assumes regions are cold and thus to be optimized for size, while
-O2 assumes they are hot and thus to be optimized for speed, in cases where
no heuristic proves otherwise. I know this doesn't 100% reflect
implementation reality, but it should be close.

IMHO we should turn on the flags we turn on with -fprofile-use and have some
more nuances in optimize_*_for_{speed,size}, now that we track profile
quality more closely.

I see -O1 as mostly worthless unless you are compiling machine-generated
code that makes -O2+ run out of memory or time. Apart from avoiding
quadratic or worse algorithms, -O1 sees no love.

On its own, -O3 doesn't add much (some loop opts and slightly more
aggressive inlining/unrolling), so whatever it does we should consider doing
at -O2 eventually.

Richard.
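As a rough illustration of the hot/cold distinction above, the same idea can be expressed per function with GCC's documented `hot` and `cold` function attributes: a function marked `cold` is optimized for size rather than speed, which is loosely what -Os assumes everywhere. This is only a sketch and makes no claim about how -Os or -O2 are implemented internally. The function names (`report_error`, `sum_array`, `checked_sum`) and the macro names are made up for the example.

```c
#include <stddef.h>
#include <stdio.h>

/* Assumption: GCC or a compatible compiler. Elsewhere the macros expand
 * to nothing and the code still compiles, just without the hints. */
#if defined(__GNUC__)
#define ATTR_COLD __attribute__((cold, noinline))
#define ATTR_HOT  __attribute__((hot))
#define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define ATTR_COLD
#define ATTR_HOT
#define UNLIKELY(x) (x)
#endif

/* Hypothetical error path: marked cold, so GCC treats it as unlikely
 * to run and optimizes it for size rather than speed. */
ATTR_COLD static long report_error(const char *msg)
{
    fprintf(stderr, "error: %s\n", msg);
    return -1;
}

/* Hypothetical hot loop: marked hot, so GCC optimizes it for speed. */
ATTR_HOT static long sum_array(const int *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* Takes the cold path only when the input pointer is NULL. */
long checked_sum(const int *v, size_t n)
{
    if (UNLIKELY(v == NULL))
        return report_error("null input");
    return sum_array(v, n);
}
```

Building this file at -O2 and at -Os and comparing the generated code for the two static functions (for example with `gcc -S`) is one way to see how much the per-region hints change relative to the global flag.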