The following is the -O2 size data from SPEC2k. Note that push/pop is a net win (negative delta) in total binary size, or total loadable section size, for every benchmark except mcf, which is unchanged.
thanks,

David

                  .text  .eh_frame  Total_binary
vortex-move      440252      40796        584066
vortex-push      415436      57452        575906
delta             -5.6%      40.8%       -1.397%

twolf-move       169324      10748        223521
twolf-push       168876      11124        223449
delta             -0.3%       3.5%       -0.032%

gzip-move         30668       3652        374399
gzip-push         30524       3740        374343
delta             -0.5%       2.4%       -0.015%

bzip2-move        22748       3196        111616
bzip2-push        22636       3284        111592
delta             -0.5%       2.8%       -0.022%

vpr-move         104684       9380        147378
vpr-push         104236       9788        147338
delta             -0.4%       4.3%       -0.027%

mcf-move           8444       1244         26760
mcf-push           8444       1244         26760
delta              0.0%       0.0%        0.000%

cc1-move        1093964      90772       1576994
cc1-push        1078988     104068       1575314
delta             -1.4%      14.6%       -0.107%

crafty-move      130556       5508       1256037
crafty-push      130236       5772       1255981
delta             -0.2%       4.8%       -0.004%

eon-move         333660      33220        516491
eon-push         330140      35812        515555
delta             -1.1%       7.8%       -0.181%

gap-move         404092      46732       1457735
gap-push         396012      53180       1456103
delta             -2.0%      13.8%       -0.112%

perlbmk-move     456572      45324        618585
perlbmk-push     449516      52340        618545
delta             -1.5%      15.5%       -0.006%

parser-move       81244      15788        334003
parser-push       80684      16332        333987
delta             -0.7%       3.4%       -0.005%

On Tue, Dec 11, 2012 at 9:14 AM, Xinliang David Li <davi...@google.com> wrote:
> On Tue, Dec 11, 2012 at 1:49 AM, Richard Biener
> <richard.guent...@gmail.com> wrote:
>> On Mon, Dec 10, 2012 at 10:07 PM, Mike Stump <mikest...@comcast.net> wrote:
>>> On Dec 10, 2012, at 12:42 PM, Xinliang David Li <davi...@google.com> wrote:
>>>> I have not measured the CFI size impact -- but conceivably it should
>>>> be larger -- which is unfortunate.
>>>
>>> Code speed and size are preferable to optimizing dwarf size… :-) I'd let
>>> dwarf 5 fix it!
>>
>> Well, different to debug info, CFI data has to be in memory to make
>> unwinding work.
>> These days most Linux distributions enable asynchronous unwind tables so any
>> size savings due to shorter push/pop epilogue/prologue sequences has to be
>> offset by the increase in CFI data. I'm not sure there is really a
>> speed difference between both variants (well, maybe due to better icache
>> footprint of the push/pop variant).
>
> Yes, for large applications, this can be crucial to performance.
>
>>
>> That said - I'd prefer to have more data on this before making the switch for
>> the generic model. What was your original motivation? Just "theory" or was
>> it a real case?
>
> 1) some of the very large internal apps I measured benefit from this
> change (in terms of performance)
> 2) both ICC and LLVM do the same.
>
> I have already committed the patch. I will find some time to collect
> more size data and post it later.
>
> thanks,
>
> David
>
>
>>
>> Thanks,
>> Richard.
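
For anyone who wants to re-check the delta rows in the table above, here is a minimal Python sketch. The helper names (pct, net_loadable_delta) are made up for illustration and do not come from any tool. It assumes the sizes are in bytes and treats .text plus .eh_frame as a rough stand-in for the loadable size, which ignores every other section.

```python
# Sizes copied from the table above:
# (benchmark, .text move, .text push, .eh_frame move, .eh_frame push)
# Assumption: the values are byte counts.
ROWS = [
    ("vortex",   440252,  415436, 40796,  57452),
    ("twolf",    169324,  168876, 10748,  11124),
    ("gzip",      30668,   30524,  3652,   3740),
    ("bzip2",     22748,   22636,  3196,   3284),
    ("vpr",      104684,  104236,  9380,   9788),
    ("mcf",        8444,    8444,  1244,   1244),
    ("cc1",     1093964, 1078988, 90772, 104068),
    ("crafty",   130556,  130236,  5508,   5772),
    ("eon",      333660,  330140, 33220,  35812),
    ("gap",      404092,  396012, 46732,  53180),
    ("perlbmk",  456572,  449516, 45324,  52340),
    ("parser",    81244,   80684, 15788,  16332),
]


def pct(old, new):
    """Relative change from old to new, in percent."""
    return 100.0 * (new - old) / old


def net_loadable_delta(row):
    """Change in .text plus change in .eh_frame (hypothetical proxy
    for loadable size; other sections are not counted)."""
    _, text_move, text_push, eh_move, eh_push = row
    return (text_push - text_move) + (eh_push - eh_move)


for row in ROWS:
    name, text_move, text_push, eh_move, eh_push = row
    print(f"{name:8s} .text {pct(text_move, text_push):+6.1f}%  "
          f".eh_frame {pct(eh_move, eh_push):+6.1f}%  "
          f"net {net_loadable_delta(row):+d}")
```

A negative net value means the .text savings from push/pop outweigh the extra .eh_frame data for that benchmark, which is the trade-off Richard raised.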