I updated the patch to skip the COMDAT-specific einline threshold when AutoFDO is enabled.
Performance testing is green.
OK for google-4_9?
Thanks,
Dehao
Index: gcc/auto-profile.c
===================================================================
--- gcc/auto-profile.c (revision 217523)
+++ gcc/auto-profile.c (working copy)
@@ -1771,6 +1771,7 @@ auto_profile (void)
free_dominance_info (CDI_DOMINATORS);
free_dominance_info (CDI_POST_DOMINATORS);
rebuild_cgraph_edges ();
+ compute_inline_parameters (cgraph_get_node (current_function_decl), true);
pop_cfun ();
}
Index: gcc/ipa-inline.c
===================================================================
--- gcc/ipa-inline.c (revision 217523)
+++ gcc/ipa-inline.c (working copy)
@@ -501,7 +501,7 @@ want_early_inline_function_p (struct cgraph_edge *
growth);
want_inline = false;
}
- else if (DECL_COMDAT (callee->decl)
+ else if (!flag_auto_profile && DECL_COMDAT (callee->decl)
&& growth <= PARAM_VALUE (PARAM_EARLY_INLINING_INSNS_COMDAT))
;
else if ((n = num_calls (callee)) != 0
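
For reference, the growth arithmetic behind the dtor()/destroy() example quoted below can be sketched as follows. This is only a rough model, not code from GCC: it assumes every einline iteration inlines each remaining recursive call to destroy(), and that each inlined body brings two new recursive calls with it. The helper name recursive_call_sites is hypothetical. Under that assumption the 1024X figure is consistent with a call count that doubles on each of the 10 iterations allowed by PARAM_EARLY_INLINER_MAX_ITERATIONS.

```c
/* Hypothetical model of the call-site blowup; not part of GCC.
   Assumption: each early-inline iteration replaces every remaining
   recursive call to destroy() with a copy of its body, and that body
   contains two recursive calls (destroy(left), destroy(right)).  */
static unsigned long
recursive_call_sites (unsigned iterations)
{
  unsigned long calls = 1;  /* The single call to destroy() in dtor().  */
  unsigned i;

  for (i = 0; i < iterations; i++)
    calls *= 2;             /* One inlined call leaves two new calls.  */

  return calls;
}
```

With this model, recursive_call_sites (10) evaluates to 1024, while a cap of 3 iterations would leave only 8 call sites. It counts call sites only and ignores the per-call size estimate that the inliner actually compares against the threshold.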
On Thu, Nov 13, 2014 at 3:42 PM, Dehao Chen <[email protected]> wrote:
> We do not do sophisticated recursive call detection in the einline phase.
> It only happens in the ipa-inline phase.
>
> Dehao
>
> On Thu, Nov 13, 2014 at 3:18 PM, Xinliang David Li <[email protected]> wrote:
>> On Thu, Nov 13, 2014 at 2:57 PM, Dehao Chen <[email protected]> wrote:
>>> IIRC, the actual iteration count for AutoFDO is mostly <3. But it
>>> should not hurt to set max iter to 10.
>>>
>>> On Thu, Nov 13, 2014 at 2:51 PM, Xinliang David Li <[email protected]>
>>> wrote:
>>>> After inline summary is recomputed, the large code growth problem will
>>>> also be better controlled, right?
>>>
>>> For this case, recomputing inline summary does not help because the
>>> code was already bloated in the first einline phase.
>>
>> For recursive inlining, the inline summary for the cloned edges needs
>> to be updated to prevent the growth?
>>
>> david
>>
>>>
>>> Dehao
>>>
>>>>
>>>> David
>>>>
>>>> On Thu, Nov 13, 2014 at 2:48 PM, Xinliang David Li <[email protected]>
>>>> wrote:
>>>>> Is there a need to have 10 iterations of early inline for autofdo?
>>>>>
>>>>> David
>>>>>
>>>>> On Thu, Nov 13, 2014 at 2:25 PM, Dehao Chen <[email protected]> wrote:
>>>>>> In AutoFDO, we increase einline iterations. This could lead to
>>>>>> extensive code bloat if we have recursive calls like:
>>>>>>
>>>>>> dtor() {
>>>>>> destroy(node);
>>>>>> }
>>>>>>
>>>>>> destroy(node) {
>>>>>> destroy(left)
>>>>>> destroy(right)
>>>>>> }
>>>>>>
>>>>>> In this case, the size growth will be around 8, which is smaller than
>>>>>> the threshold (11). However, if we allow this to happen for 2 iterations,
>>>>>> it will expand the size by 1024X. To fix this problem, we want to set
>>>>>> a much smaller threshold in the AutoFDO case. This is because AutoFDO
>>>>>> does not rely on aggressive einline to gain more profile context.
>>>>>>
>>>>>> Also, in the AutoFDO pass, after we process a function, we need to
>>>>>> recompute inline parameters because rebuild_cgraph_edges will zero out
>>>>>> all inline parameters.
>>>>>>
>>>>>> The patch is attached below, bootstrapped and perf test on-going. OK
>>>>>> for google-4_9?
>>>>>>
>>>>>> Thanks,
>>>>>> Dehao
>>>>>>
>>>>>> Index: gcc/auto-profile.c
>>>>>> ===================================================================
>>>>>> --- gcc/auto-profile.c (revision 217523)
>>>>>> +++ gcc/auto-profile.c (working copy)
>>>>>> @@ -1771,6 +1771,7 @@ auto_profile (void)
>>>>>> free_dominance_info (CDI_DOMINATORS);
>>>>>> free_dominance_info (CDI_POST_DOMINATORS);
>>>>>> rebuild_cgraph_edges ();
>>>>>> + compute_inline_parameters (cgraph_get_node (current_function_decl), true);
>>>>>> pop_cfun ();
>>>>>> }
>>>>>>
>>>>>> Index: gcc/opts.c
>>>>>> ===================================================================
>>>>>> --- gcc/opts.c (revision 217523)
>>>>>> +++ gcc/opts.c (working copy)
>>>>>> @@ -1853,6 +1853,12 @@ common_handle_option (struct gcc_options *opts,
>>>>>> maybe_set_param_value (
>>>>>> PARAM_EARLY_INLINER_MAX_ITERATIONS, 10,
>>>>>> opts->x_param_values, opts_set->x_param_values);
>>>>>> + maybe_set_param_value (
>>>>>> + PARAM_EARLY_INLINING_INSNS, 4,
>>>>>> + opts->x_param_values, opts_set->x_param_values);
>>>>>> + maybe_set_param_value (
>>>>>> + PARAM_EARLY_INLINING_INSNS_COMDAT, 4,
>>>>>> + opts->x_param_values, opts_set->x_param_values);
>>>>>> value = true;
>>>>>> /* No break here - do -fauto-profile processing. */
>>>>>> case OPT_fauto_profile: