Hi,
this patch removes early inlining from the afdo pass, since all inlining
should now happen in the early inliner.  I tested this on SPEC and there
are 3 inlines happening here which are blocked at early-inline time by
hitting the large function growth limit.  We probably want to bypass that
limit; I will look into that incrementally.

This should hopefully make merging the profiles of non-inlined functions
easier.

It may still make sense to separate the afdo inliner from the early inliner
to solve the non-transitivity issues, which is not that hard to do with the
current code organization.  However, this should be a separate IPA pass
rather than another part of the afdo pass, since it is conceptually
separate.

Bootstrapped/regtested x86_64-linux, will commit it shortly.

Honza

gcc/ChangeLog:

        * auto-profile.cc: Update toplevel comment.
        (early_inline): Remove.
        (auto_profile): Don't do early inlining.

diff --git a/gcc/auto-profile.cc b/gcc/auto-profile.cc
index 8a1d9f878c6..3f8310e6324 100644
--- a/gcc/auto-profile.cc
+++ b/gcc/auto-profile.cc
@@ -76,21 +76,30 @@ along with GCC; see the file COPYING3.  If not see
      standalone symbol, or a clone of a function that is inlined into another
      function.
 
-   Phase 2: Early inline + value profile transformation.
-     Early inline uses autofdo_source_profile to find if a callsite is:
+   Phase 2: AFDO inline + value profile transformation.
+     This happens during early optimization.
+     During early inlning AFDO inliner is executed which
+     uses autofdo_source_profile to find if a callsite is:
         * inlined in the profiled binary.
         * callee body is hot in the profiling run.
      If both condition satisfies, early inline will inline the callsite
      regardless of the code growth.
-     Phase 2 is an iterative process. During each iteration, we also check
-     if an indirect callsite is promoted and inlined in the profiling run.
-     If yes, vpt will happen to force promote it and in the next iteration,
-     einline will inline the promoted callsite in the next iteration.
+
+     Performing this early has benefit of doing early optimizations
+     before read IPA passe and getting more "context sensitivity" of
+     the profile read.  Profile of inlined functions may differ
+     significantly form one inline instance to another and from the
+     offline version.
+
+     This is controlled by -fauto-profile-inlinig and is independent
+     of -fearly-inlining.
 
    Phase 3: Annotate control flow graph.
      AutoFDO uses a separate pass to:
         * Annotate basic block count
         * Estimate branch probability
+       * Use earlier static profile to fill in the gaps
+         if AFDO profile is ambigous
 
    After the above 3 phases, all profile is readily annotated on the GCC IR.
    AutoFDO tries to reuse all FDO infrastructure as much as possible to make
@@ -2217,18 +2226,6 @@ afdo_annotate_cfg (void)
   free_dominance_info (CDI_POST_DOMINATORS);
 }
 
-/* Wrapper function to invoke early inliner.  */
-
-static unsigned int
-early_inline ()
-{
-  compute_fn_summary (cgraph_node::get (current_function_decl), true);
-  unsigned int todo = early_inliner (cfun);
-  if (todo & TODO_update_ssa_any)
-    update_ssa (TODO_update_ssa);
-  return todo;
-}
-
 /* Use AutoFDO profile to annoate the control flow graph.
    Return the todo flag.  */
 
@@ -2254,15 +2251,9 @@ auto_profile (void)
 
     push_cfun (DECL_STRUCT_FUNCTION (node->decl));
 
-    unsigned int todo = early_inline ();
     autofdo::afdo_annotate_cfg ();
     compute_function_frequency ();
 
-    /* Local pure-const may imply need to fixup the cfg.  */
-    todo |= execute_fixup_cfg ();
-    if (todo & TODO_cleanup_cfg)
-      cleanup_tree_cfg ();
-
     free_dominance_info (CDI_DOMINATORS);
     free_dominance_info (CDI_POST_DOMINATORS);
     cgraph_edge::rebuild_edges ();
