On Mon, Oct 14, 2013 at 3:04 PM, Xinliang David Li <davi...@google.com> wrote: > On Mon, Oct 14, 2013 at 2:34 PM, Dehao Chen <de...@google.com> wrote: >> For my test case, the entire inline instance is optimized away, > > do you mean there is no out of line instance for the target function > in the profile binary?
Yes, and there is no inline instance either. Dehao > > David > >> so >> there is no info about it in the profile. I can do some fixup in the >> rebuild_cgraph_edge though. >> >> Dehao >> >> On Mon, Oct 14, 2013 at 2:27 PM, Xinliang David Li <davi...@google.com> >> wrote: >>> Is it possible to update the callee node summary after profile >>> annotate (using information from inline instances which are not >>> inlined in early inline)? >>> >>> David >>> >>> On Mon, Oct 14, 2013 at 2:18 PM, Dehao Chen <de...@google.com> wrote: >>>> On Mon, Oct 14, 2013 at 12:49 PM, Jan Hubicka <hubi...@ucw.cz> wrote: >>>>>> Not for instrumented FDO (not as I know of). But for AutoFDO, this >>>>>> could be a potential risk because some callee is marked unlikely >>>>>> executed simply because they are inlined and eliminated in the O2 >>>>>> binary. But in ipa-inline it will not get inlined because the edge is >>>>>> not hot from cgraph_maybe_hot_edge_p (because callee is >>>>>> UNLIKELY_EXECUTED), while the edge->count is actually hot. >>>>> >>>>> Can't you prevent setting calle to UNLIKELY_EXECUTED in these cases >>>>> instead? >>>>> It seems that having profile set incorrectly will lead to other problems >>>>> later, too. >>>>> We discussed similar problem with Teresa about the missing profiles for >>>>> comdat, >>>>> basically one should detect these cases as profile being lost and go with >>>>> guessed >>>>> profile. (I believe patch for that was posted, too, and so far it seems >>>>> best approach >>>>> to this issue) >>>> >>>> The current AutoFDO implementation will take all functions that do not >>>> have have profile as normally executed, thus use guessed profile for >>>> it. This is like using profile for truly hot functions, and using O2 >>>> for other functions. This works fine. However, it leads to larger code >>>> size (approximately 10%~20% larger than FDO). >>>> >>>> I'd like to introduce another mode for users who care about both >>>> performance and code size, and can be sure that profile is >>>> representative. In this mode, we will mark all functions without >>>> sample as "unlikely executed". However, because AutoFDO use debug info >>>> (of optimized code) to represent profile, it's possible that some hot >>>> functions (say foo) are inlined and fully eliminated into another hot >>>> function (say bar). So in the profile, bar is cold, and because the >>>> profile for foo::bar is eliminated, bar will not be inlined into foo >>>> before the profile annotation. However, after profile annotate, we can >>>> infer from the bb count that foo->bar is hot, thus it should be >>>> inlined in ipa-inline phase. However, because bar itself is marked >>>> UNLIKELY_EXECUTED, it will not be inlined. >>>> >>>> One possible workaround would be that during rebuild_cgraph_edges, if >>>> we find an edge's callee is unlikely executed, add the edge count to >>>> the callee's count and recalculate callee's frequency. >>>> >>>> Dehao >>>> >>>>> >>>>> Honza