Hi,

On Sun, Mar 29, 2015 at 05:43:20PM +0200, Jan Hubicka wrote:
> Hi,

thanks for committing the patch, I'm just wondering why...

> this patch improve crafty performance by avoiding ipa-cp clonning of
> Search function that specializes the first iteration of the recursion.
> The patch is by Martin, I only tested it and cleaned up code in count_callers
> and set_single_call_flag
> 
> Bootstrapped/regtested x86_64-linux, comitted.
>       PR ipa/65478
>       * params.def (PARAM_IPA_CP_RECURSION_PENALTY) : New.
>       (PARAM_IPA_CP_SINGLE_CALL_PENALTY): Likewise.
>       * ipa-prop.h (ipa_node_params): New flags node_within_scc and
>       node_calling_single_call.
>       * ipa-cp.c (count_callers): New function.
>       (set_single_call_flag): Likewise.
>       (initialize_node_lattices): Count callers and set single_flag_call if
>       necessary.
>       (incorporate_penalties): New function.
>       (good_cloning_opportunity_p): Use it, dump new flags.
>       (propagate_constants_topo): Set node_within_scc flag if appropriate.
>       * doc/invoke.texi (ipa-cp-recursion-penalty,
>       ipa-cp-single-call-pentalty): Document.
> Index: params.def
> ===================================================================
> --- params.def        (revision 221757)
> +++ params.def        (working copy)
> @@ -999,6 +999,18 @@ DEFPARAM (PARAM_IPA_CP_EVAL_THRESHOLD,
>         "beneficial to clone.",
>         500, 0, 0)
>  
> +DEFPARAM (PARAM_IPA_CP_RECURSION_PENALTY,
> +       "ipa-cp-recursion-penalty",
> +       "Percentage penalty the recursive functions will receive when they "
> +       "are evaluated for cloning.",
> +       40, 0, 100)
> +
> +DEFPARAM (PARAM_IPA_CP_SINGLE_CALL_PENALTY,
> +       "ipa-cp-single-call-penalty",
> +       "Percentage penalty functions containg a single call to another "
> +       "function will receive when they are evaluated for cloning.",
> +       15, 0, 100)
> +
>  DEFPARAM (PARAM_IPA_MAX_AGG_ITEMS,
>         "ipa-max-agg-items",
>         "Maximum number of aggregate content items for a parameter in "
> Index: ipa-prop.h
> ===================================================================
> --- ipa-prop.h        (revision 221757)
> +++ ipa-prop.h        (working copy)
> @@ -330,6 +330,10 @@ struct ipa_node_params
>    /* Node has been completely replaced by clones and will be removed after
>       ipa-cp is finished.  */
>    unsigned node_dead : 1;
> +  /* Node is involved in a recursion, potentionally indirect.  */
> +  unsigned node_within_scc : 1;
> +  /* Node is calling a private function called only once.  */
> +  unsigned node_calling_single_call : 1;
>  };
>  
>  /* ipa_node_params access functions.  Please use these to access fields that
> Index: ipa-cp.c
> ===================================================================
> --- ipa-cp.c  (revision 221757)
> +++ ipa-cp.c  (working copy)
> @@ -811,6 +811,41 @@ set_all_contains_variable (struct ipcp_p
>    return ret;
>  }
>  
> +/* Worker of call_for_symbol_thunks_and_aliases, increment the integer DATA
> +   points to by the number of callers to NODE.  */
> +
> +static bool
> +count_callers (cgraph_node *node, void *data)
> +{
> +  int *caller_count = (int *) data;
> +
> +  for (cgraph_edge *cs = node->callers; cs; cs = cs->next_caller)
> +    /* Local thunks can be handled transparently, but if the thunk can not
> +       be optimized out, count it as a real use.  */
> +    if (!cs->caller->thunk.thunk_p || !cs->caller->local.local)
> +      ++*caller_count;

...we don't recurse for local thunks and do not count their number of
callers here.  I'm not sure if it can ever happen in practice but if
we have a local function that is called multiple times but always
through one particular local thunk, we should end with caller_count
greater than one, and with this code we will not?  Such function
certainly should not be considered called just once by the inliner.

Similarly, if we ever have a local node with a non-local thunk (would
that be legal and make sense?), then we should try not ending up with
caller_count == 1, because inliner will also never consider that
function just called once.  So I'd suggest incrementing the caller by
two instead in that case.

But maybe I just don't understand what "handled transparently" in the
comment means.

Thanks for clearing this up,

Martin

Reply via email to