Hi Martin,

I have been playing with --param ipa-cp-large-unit-insns but it doesn't seem to have any meaningful effect on exchange2 and I still can't recover the 12% regression vs GCC 10.
Do I need to use another parameter here?

Thanks,
Tamar

> -----Original Message-----
> From: Gcc-patches <gcc-patches-boun...@gcc.gnu.org> On Behalf Of Martin Jambor
> Sent: Monday, September 21, 2020 3:25 PM
> To: GCC Patches <gcc-patches@gcc.gnu.org>
> Cc: Jan Hubicka <j...@suse.cz>
> Subject: [PATCH 6/6] ipa-cp: Separate and increase the large-unit parameter
>
> A previous patch in the series has taught IPA-CP to identify the important
> cloning opportunities in 548.exchange2_r as worthwhile on their own, but
> the optimization is still prevented from taking place because of the overall
> unit-growth limit.  This patch raises that limit so that it takes place and
> the benchmark runs 30% faster (on AMD Zen2 CPU at least).
>
> Before this patch, IPA-CP uses the following formulae to arrive at the
> overall_size limit:
>
>   base = MAX(orig_size, param_large_unit_insns)
>   unit_growth_limit = base + base * param_ipa_cp_unit_growth / 100
>
> where param_ipa_cp_unit_growth has default value 10 and
> param_large_unit_insns has default value 10000.
>
> The problem with exchange2 (at least on zen2 but I have had a quick look on
> aarch64 too) is that the original estimated unit size is 10513 and so
> param_large_unit_insns does not apply and the default limit is therefore
> 11564, which is good enough only for one of the ideal 8 clonings; we need
> the limit to be at least 16291.
>
> I would like to raise param_ipa_cp_unit_growth a little bit more soon too,
> but most certainly not to 55.  Therefore, the large_unit must be increased.
> In this patch, I decided to decouple the inlining and ipa-cp large-unit
> parameters.  It also makes sense because IPA-CP uses it only at -O3 while
> inlining also at -O2 (IIUC).  But if we agree we can try raising
> param_large_unit_insns to 13-14 thousand "instructions," perhaps it is not
> necessary.  But then again, it may make sense to actually increase the
> IPA-CP limit further.
>
> I plan to experiment with IPA-CP tuning on a larger set of programs.
> Meanwhile, mainly to address the 548.exchange2_r regression, I'm
> suggesting this simple change.
>
> gcc/ChangeLog:
>
> 2020-09-07  Martin Jambor  <mjam...@suse.cz>
>
> 	* params.opt (ipa-cp-large-unit-insns): New parameter.
> 	* ipa-cp.c (get_max_overall_size): Use the new parameter.
> ---
>  gcc/ipa-cp.c   | 2 +-
>  gcc/params.opt | 4 ++++
>  2 files changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
> index 12acf24c553..2152f9e5876 100644
> --- a/gcc/ipa-cp.c
> +++ b/gcc/ipa-cp.c
> @@ -3448,7 +3448,7 @@ static long
>  get_max_overall_size (cgraph_node *node)
>  {
>    long max_new_size = orig_overall_size;
> -  long large_unit = opt_for_fn (node->decl, param_large_unit_insns);
> +  long large_unit = opt_for_fn (node->decl, param_ipa_cp_large_unit_insns);
>    if (max_new_size < large_unit)
>      max_new_size = large_unit;
>    int unit_growth = opt_for_fn (node->decl, param_ipa_cp_unit_growth);
> diff --git a/gcc/params.opt b/gcc/params.opt
> index acb59f17e45..9d177ab50ad 100644
> --- a/gcc/params.opt
> +++ b/gcc/params.opt
> @@ -218,6 +218,10 @@ Percentage penalty functions containing a single call to another function will r
>  Common Joined UInteger Var(param_ipa_cp_unit_growth) Init(10) Param Optimization
>  How much can given compilation unit grow because of the interprocedural constant propagation (in percent).
>
> +-param=ipa-cp-large-unit-insns=
> +Common Joined UInteger Var(param_ipa_cp_large_unit_insns) Optimization Init(16000) Param
> +The size of translation unit that IPA-CP pass considers large.
> +
>  -param=ipa-cp-value-list-size=
>  Common Joined UInteger Var(param_ipa_cp_value_list_size) Init(8) Param Optimization
>  Maximum size of a list of values associated with each parameter for interprocedural constant propagation.
> --
> 2.28.0
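
For reference, the limit arithmetic from the quoted patch description can be sketched in a few lines of standalone C. This is only an illustration under stated assumptions, not the GCC code: unit_growth_limit and min_growth_percent are hypothetical helpers that mirror the formula as written in the description (the real get_max_overall_size may differ in detail), and the inputs are the figures quoted above.

```c
/* Sketch of the limit formula from the patch description; not GCC code.  */

/* Hypothetical helper: base = MAX (orig_size, large_unit),
   limit = base + base * growth_percent / 100, in integer arithmetic.  */
long
unit_growth_limit (long orig_size, long large_unit, int growth_percent)
{
  long base = orig_size > large_unit ? orig_size : large_unit;
  return base + base * growth_percent / 100;
}

/* Hypothetical helper: smallest growth percentage (searched from 0 to
   1000) whose limit reaches NEEDED, or -1 if none is found.  */
int
min_growth_percent (long orig_size, long large_unit, long needed)
{
  for (int g = 0; g <= 1000; g++)
    if (unit_growth_limit (orig_size, large_unit, g) >= needed)
      return g;
  return -1;
}
```

With the quoted figures, unit_growth_limit (10513, 10000, 10) gives 11564 and unit_growth_limit (10513, 16000, 10) gives 17600, which clears the 16291 needed for all 8 clonings. Leaving the large-unit value at 10000, min_growth_percent (10513, 10000, 16291) gives 55, matching the growth figure the description rules out.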