On 9/6/19 4:58 AM, Graham Markall wrote: > This patch is an RFC to invite comments as to whether the approach > to solving the problem is a suitable one for upstream GCC, or whether > there are alternative approaches that would be recommended. > > Motivation > ---------- > > We have observed that in some cases the estimation of callee growth for > inlining particular functions can be tuned for better overall code size > with particular programs on particular targets. Although modification of > the heuristics to make a general improvement is a difficult problem to > tackle, simply biasing the growth by a fixed amount can lead to > improvements in code size within the context of a particular program. > > This has first been tested on a proprietary program, where setting the > growth bias to -2 resulted in a saving of 1.35% in the text section size > (62396 bytes as opposed to 63252 bytes). Using the Embench suite ( > https://www.embench.org/ ) for a riscv32 bare-metal target with -Os also > shows that adjusting the inline growth estimate carefully can also > reduce code size. Results presented here are percentages relative to the > Embench reference platform (which is the standard way of presenting > Embench results) for values of the bias from -2 to 2: > > Benchmark -2 -1 0 1 2 > aha-mont64 99.05 99.05 99.05 99.05 99.05 > crc32 100.00 100.00 100.00 100.00 100.00 > cubic 94.66 94.66 94.66 94.66 94.66 > edn 100.00 100.00 100.00 100.00 100.00 > huffbench 99.88 99.88 99.88 99.88 99.88 > matmult-int 100.00 100.00 100.00 102.86 102.86 > minver 97.96 97.96 97.96 106.88 106.88 > nbody 98.87 98.87 98.87 98.87 98.87 > nettle-aes 99.31 99.10 99.10 99.10 99.10 > nettle-sha256 99.89 99.89 99.89 99.89 99.89 > nsichneu 100.00 100.00 100.00 100.00 100.00 > picojpeg 102.56 102.54 99.73 99.38 99.28 > qrduino 99.70 99.70 99.90 99.90 99.90 > sglib-combined 95.52 100.00 100.00 100.43 101.12 > slre 100.66 100.66 99.09 98.85 98.85 > st 96.36 96.36 96.36 96.82 96.82 > statemate 100.00 100.00 100.00 100.00 100.00 > ud 100.00 100.00 100.00 100.00 100.00 > wikisort 99.48 99.48 99.48 99.48 99.48 > Mean 99.14 99.36 99.15 99.77 99.80 > > In most cases, leaving the bias at 0 (unmodified) produces the smallest > code. However, there are some cases where an alternative value prevails, > The "Best Diff" column shows the reduction in size compared to leaving > the bias at 0, for cases where changing it yielded an improvement: > > Benchmark Best Worst Best diff > aha-mont64 0 0 > crc32 0 0 > cubic 0 0 > edn 0 0 > huffbench 0 0 > matmult-int 0 1 / 2 > minver 0 1 / 2 > nbody 0 0 > nettle-aes 0 -2 > nettle-sha256 0 0 > nsichneu 0 0 > picojpeg 2 -2 -0.45% > qrduino -1 /-2 0 -0.20% > sglib-combined -2 1 -4.48% > slre 1 / 2 -1 / -2 -0.24% > st 0 1 / 2 > statemate 0 0 > ud 0 0 > wikisort 0 0 > > > In summary, for this test setup: > > - In most cases, leaving the growth bias at 0 is optimal. > - Biasing the growth estimate positively may either increase or decrease > code size. > - Biasing the estimate negatively may also either increase or decrease > code size. > - In a small number of cases, biasing the growth estimate improves code > size (up to 4.48% smaller for sglib-combined). > > > Patch > ----- > > The growth bias can be set with a flag: > > -finline-growth-bias=<number> > > which controls the bias (an increase or decrease) of the inline growth > in ipa-inline.c. In cases where the bias would result in a negative > function size, we clip the growth estimate so that adding the growth to > the size of the function results in a size of 0, by setting the growth > to the negative of the function size. > > There is not a great deal of validation of the argument - if a > non-integer is passed as the parameter (e.g. -finline-growth-bias=xyz), > it will be as if the parameter were 0. More validation could be added if > necessary. It seemed to me that GCC's infrastructure doesn't seem to > anticipate option values / parameters that could contain negative > values. For parameters, -1 seemed to represent an error and could result > in an ICE (-2 etc. pass through OK though). For options, I looked at the > UInteger and Host_Wide_Int numeric types, but they both expect a > positive integer. I did consider extending this with an Integer type > that could accept positive and negative integers, (e.g. starting with > augmenting switch_bit_fields in opt-functions.awk) but rooting out > assumptions of non-negative integer values in places further down seemed > like an onerous task. > > > Further discussion > ------------------ > > Given that a default setting of 0 (equivalent to current behaviour) > should provide for some potential improvement in code size in individual > situations where it is critical (albeit using a rather coarse instrument > to do so), without affecting the general case - is the addition of such > a flag to GCC likely to be considered an acceptable inclusion for this > purpose? > > > gcc/ChangeLog: > > * common.opt: Add -finline-growth-bias= flag. > * ipa-inline.h (estimate_edge_growth): Adjust inline > growth by bias factor, if supplied. I'm not sure if we really want to expose this as a first class option. Perhaps a PARAM would be better. Jan would have the final call though since he's done the most work on the IPA bits.
Jeff