On Mon, Oct 14, 2013 at 3:04 PM, Xinliang David Li <davi...@google.com> wrote:
> On Mon, Oct 14, 2013 at 2:34 PM, Dehao Chen <de...@google.com> wrote:
>> For my test case, the entire inline instance is optimized away,
>
> do you mean there is no out of line instance for the target function
> in the profile binary?

Yes, and there is no inline instance either.

Dehao

>
> David
>
>> so
>> there is no info about it in the profile. I can do some fixup in the
>> rebuild_cgraph_edges though.
>>
>> Dehao
>>
>> On Mon, Oct 14, 2013 at 2:27 PM, Xinliang David Li <davi...@google.com> wrote:
>>> Is it possible to update the callee node summary after profile
>>> annotate (using information from inline instances which are not
>>> inlined in early inline)?
>>>
>>> David
>>>
>>> On Mon, Oct 14, 2013 at 2:18 PM, Dehao Chen <de...@google.com> wrote:
>>>> On Mon, Oct 14, 2013 at 12:49 PM, Jan Hubicka <hubi...@ucw.cz> wrote:
>>>>>> Not for instrumented FDO (not that I know of). But for AutoFDO, this
>>>>>> could be a potential risk because some callees are marked unlikely
>>>>>> executed simply because they were inlined and eliminated in the O2
>>>>>> binary. In ipa-inline such a callee will not get inlined because
>>>>>> cgraph_maybe_hot_edge_p reports the edge as not hot (the callee is
>>>>>> UNLIKELY_EXECUTED), while the edge->count is actually hot.
>>>>>
>>>>> Can't you prevent setting the callee to UNLIKELY_EXECUTED in these cases
>>>>> instead? It seems that having the profile set incorrectly will lead to
>>>>> other problems later, too. We discussed a similar problem with Teresa
>>>>> about the missing profiles for comdat; basically one should detect these
>>>>> cases as the profile being lost and go with the guessed profile. (I
>>>>> believe a patch for that was posted, too, and so far it seems the best
>>>>> approach to this issue.)
>>>>
>>>> The current AutoFDO implementation treats all functions that do not
>>>> have a profile as normally executed, and thus uses a guessed profile
>>>> for them. This is like using the profile for truly hot functions and
>>>> using O2 for the other functions. This works fine. However, it leads to
>>>> larger code size (approximately 10%~20% larger than FDO).
>>>>
>>>> I'd like to introduce another mode for users who care about both
>>>> performance and code size, and who can be sure that the profile is
>>>> representative. In this mode, we will mark all functions without
>>>> samples as "unlikely executed". However, because AutoFDO uses debug info
>>>> (of optimized code) to represent the profile, it's possible that some hot
>>>> function (say bar) is inlined into another hot function (say foo) and
>>>> fully eliminated. So in the profile, bar is cold, and because the
>>>> profile for the inline instance foo::bar is eliminated, bar will not be
>>>> inlined into foo before the profile annotation. After profile
>>>> annotation, we can infer from the bb count that foo->bar is hot, so it
>>>> should be inlined in the ipa-inline phase. However, because bar itself
>>>> is marked UNLIKELY_EXECUTED, it will not be inlined.
>>>>
>>>> One possible workaround would be that during rebuild_cgraph_edges, if
>>>> we find an edge's callee is unlikely executed, add the edge count to
>>>> the callee's count and recalculate callee's frequency.
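
For illustration, the workaround described above could look roughly like the
sketch below. This is not GCC code: Node, Edge, Frequency, classify and
fixup_unlikely_callees are hypothetical stand-ins for the cgraph structures,
and the threshold rule is an assumption. It only shows the idea of crediting
edge counts to a callee that was marked unlikely executed and then
recomputing its frequency.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT GCC's
// cgraph_node / cgraph_edge types.
enum class Frequency { UnlikelyExecuted, Normal, Hot };

struct Node {
  std::int64_t count = 0;  // profile count of the function itself
  Frequency frequency = Frequency::UnlikelyExecuted;
};

struct Edge {
  Node *callee;        // function being called
  std::int64_t count;  // count inferred from the caller's bb counts
};

// Assumed classification rule (hot_threshold is expected to be positive):
// counts at or above the threshold are Hot, other nonzero counts are Normal.
Frequency classify(std::int64_t count, std::int64_t hot_threshold) {
  if (count >= hot_threshold) return Frequency::Hot;
  if (count > 0) return Frequency::Normal;
  return Frequency::UnlikelyExecuted;
}

// Credit the counts of all incoming edges to callees that were marked
// unlikely executed, then recompute the frequency of those callees.
// Eligibility is decided before any node is updated, so the result does
// not depend on the order of the edges.
void fixup_unlikely_callees(const std::vector<Edge> &edges,
                            std::int64_t hot_threshold) {
  std::unordered_map<Node *, std::int64_t> extra;
  for (const Edge &e : edges) {
    assert(e.callee != nullptr);
    if (e.callee->frequency == Frequency::UnlikelyExecuted)
      extra[e.callee] += e.count;
  }
  for (const auto &kv : extra) {
    kv.first->count += kv.second;
    kv.first->frequency = classify(kv.first->count, hot_threshold);
  }
}
```

In this sketch a callee that already has samples keeps its own count and
frequency; only callees without samples pick up counts from their callers.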
>>>>
>>>> Dehao
>>>>
>>>>>
>>>>> Honza
