Hi, On Sun, Mar 29, 2015 at 05:43:20PM +0200, Jan Hubicka wrote: > Hi,
thanks for committing the patch, I'm just wondering why... > this patch improve crafty performance by avoiding ipa-cp clonning of > Search function that specializes the first iteration of the recursion. > The patch is by Martin, I only tested it and cleaned up code in count_callers > and set_single_call_flag > > Bootstrapped/regtested x86_64-linux, comitted. > PR ipa/65478 > * params.def (PARAM_IPA_CP_RECURSION_PENALTY) : New. > (PARAM_IPA_CP_SINGLE_CALL_PENALTY): Likewise. > * ipa-prop.h (ipa_node_params): New flags node_within_scc and > node_calling_single_call. > * ipa-cp.c (count_callers): New function. > (set_single_call_flag): Likewise. > (initialize_node_lattices): Count callers and set single_flag_call if > necessary. > (incorporate_penalties): New function. > (good_cloning_opportunity_p): Use it, dump new flags. > (propagate_constants_topo): Set node_within_scc flag if appropriate. > * doc/invoke.texi (ipa-cp-recursion-penalty, > ipa-cp-single-call-pentalty): Document. > Index: params.def > =================================================================== > --- params.def (revision 221757) > +++ params.def (working copy) > @@ -999,6 +999,18 @@ DEFPARAM (PARAM_IPA_CP_EVAL_THRESHOLD, > "beneficial to clone.", > 500, 0, 0) > > +DEFPARAM (PARAM_IPA_CP_RECURSION_PENALTY, > + "ipa-cp-recursion-penalty", > + "Percentage penalty the recursive functions will receive when they " > + "are evaluated for cloning.", > + 40, 0, 100) > + > +DEFPARAM (PARAM_IPA_CP_SINGLE_CALL_PENALTY, > + "ipa-cp-single-call-penalty", > + "Percentage penalty functions containg a single call to another " > + "function will receive when they are evaluated for cloning.", > + 15, 0, 100) > + > DEFPARAM (PARAM_IPA_MAX_AGG_ITEMS, > "ipa-max-agg-items", > "Maximum number of aggregate content items for a parameter in " > Index: ipa-prop.h > =================================================================== > --- ipa-prop.h (revision 221757) > +++ ipa-prop.h (working copy) > @@ -330,6 +330,10 @@ struct ipa_node_params > /* Node has been completely replaced by clones and will be removed after > ipa-cp is finished. */ > unsigned node_dead : 1; > + /* Node is involved in a recursion, potentionally indirect. */ > + unsigned node_within_scc : 1; > + /* Node is calling a private function called only once. */ > + unsigned node_calling_single_call : 1; > }; > > /* ipa_node_params access functions. Please use these to access fields that > Index: ipa-cp.c > =================================================================== > --- ipa-cp.c (revision 221757) > +++ ipa-cp.c (working copy) > @@ -811,6 +811,41 @@ set_all_contains_variable (struct ipcp_p > return ret; > } > > +/* Worker of call_for_symbol_thunks_and_aliases, increment the integer DATA > + points to by the number of callers to NODE. */ > + > +static bool > +count_callers (cgraph_node *node, void *data) > +{ > + int *caller_count = (int *) data; > + > + for (cgraph_edge *cs = node->callers; cs; cs = cs->next_caller) > + /* Local thunks can be handled transparently, but if the thunk can not > + be optimized out, count it as a real use. */ > + if (!cs->caller->thunk.thunk_p || !cs->caller->local.local) > + ++*caller_count; ...we don't recurse for local thunks and do not count their number of callers here. I'm not sure if it can ever happen in practice but if we have a local function that is called multiple times but always through one particular local thunk, we should end with caller_count greater than one, and with this code we will not? Such function certainly should not be considered called just once by the inliner. Similarly, if we ever have a local node with a non-local thunk (would that be legal and make sense?), then we should try not ending up with caller_count == 1, because inliner will also never consider that function just called once. So I'd suggest incrementing the caller by two instead in that case. But maybe I just don't understand what "handled transparently" in the comment means. Thanks for clearing this up, Martin