On Thu, Apr 14, 2011 at 12:20 AM, Jan Hubicka <hubi...@ucw.cz> wrote:
> Hi,
> this patch moves inline_summary from a field in cgraph_node into its own
> on-side data structure.  This replaces an arcane decision of mine to split
> all IPA data into global/local data stored in a common data structure with
> the scheme we developed for new IPA passes some time ago.
>
> The advantage is that the code is more contained and less spread across the
> compiler.  We also make cgraph_node smaller and dumps more compact, which
> never hurts.
>
> While working on it I noticed that Richi's patch to introduce cgraph_edge
> times/sizes is a bit iffy in computing data when they are missing in the
> data structure.  It also computes incoming edge costs instead of outgoing
> ones, which means that not all edges get their info computed for the IPA
> inliner (think of newly discovered direct calls or IPA merging).

Ah, that was the reason ... I didn't dig deep enough ... ;)
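
For anyone following the thread, here is a minimal standalone sketch of why the
direction matters.  The types and function names are hypothetical stand-ins,
not GCC's cgraph API, and stmt_cost replaces the real statement estimate.  It
only shows that filling costs on outgoing edges covers every call of the
analyzed function, while filling them on incoming edges leaves an edge empty
until its callee happens to be analyzed.

```c
#include <assert.h>

/* Hypothetical miniature call graph; these are not GCC's cgraph types.  */
struct fn;

struct edge
{
  struct fn *caller, *callee;
  int stmt_cost;             /* stand-in for the real statement estimate */
  int call_stmt_size;        /* 0 means "not computed yet" */
  struct edge *next_callee;  /* next edge leaving the same caller */
  struct edge *next_caller;  /* next edge entering the same callee */
};

struct fn
{
  struct edge *callees;      /* outgoing edges */
  struct edge *callers;      /* incoming edges */
};

/* Link E into the callee list of CALLER and the caller list of CALLEE.  */
static void
add_edge (struct edge *e, struct fn *caller, struct fn *callee, int cost)
{
  e->caller = caller;
  e->callee = callee;
  e->stmt_cost = cost;
  e->call_stmt_size = 0;
  e->next_callee = caller->callees;
  caller->callees = e;
  e->next_caller = callee->callers;
  callee->callers = e;
}

/* Outgoing scheme: analyzing F fills in every call made by F.  */
static void
analyze_outgoing (struct fn *f)
{
  for (struct edge *e = f->callees; e; e = e->next_callee)
    e->call_stmt_size = e->stmt_cost;
}

/* Incoming scheme: analyzing F fills in only the calls made to F.  */
static void
analyze_incoming (struct fn *f)
{
  for (struct edge *e = f->callers; e; e = e->next_caller)
    e->call_stmt_size = e->stmt_cost;
}

/* Consumer with a sanity check in the spirit of gcc_checking_assert.  */
static int
checked_call_stmt_size (const struct edge *e)
{
  assert (e->call_stmt_size);
  return e->call_stmt_size;
}
```

With one edge from a caller to a callee that is never analyzed, calling
analyze_incoming on the caller leaves call_stmt_size at 0, so the checked
accessor would trip; analyze_outgoing on the same caller fills it in.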

>
> I fixed this along the way and added a sanity check that the fields are
> initialized.  This exposed a problem with early inliner iteration, fixed
> thusly, and the fact that the early inliner attempts to compute overall
> growth at a time when the inline parameters are not yet computed for
> functions not visited by early optimizations.  We previously agreed that the
> early inliner should not try to do that (as this leads to the early inliner
> inlining functions called once that should be deferred for later
> consideration).  I just hope it won't cause benchmarks to regress too much ;)

Yeah, we agreed to that.  And I forgot about it as it wasn't part of the
early inliner reorg (which was supposed to be a 1:1 transform).
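
To make the behaviour change concrete, a small sketch of the two rejection
tests from the cgraph_decide_inlining_incrementally hunk.  The struct and the
two predicate names are made up for illustration; the fields stand in for the
results of estimate_edge_growth and estimate_growth.

```c
#include <stdbool.h>

/* Hypothetical container for the two estimates used by the old check.  */
struct growth_estimates
{
  int edge_growth;     /* growth of the caller if this one call is inlined */
  int overall_growth;  /* growth of the unit if all calls are inlined */
};

/* Old test: reject only when both estimates exceed the allowance, so a
   large function whose offline copy would vanish could still be inlined
   early.  */
static bool
old_early_inline_p (struct growth_estimates g, int allowed_growth)
{
  return !(g.edge_growth > allowed_growth
           && g.overall_growth > allowed_growth);
}

/* New test: reject whenever the edge growth alone exceeds the allowance;
   the overall growth is no longer consulted.  */
static bool
new_early_inline_p (struct growth_estimates g, int allowed_growth)
{
  return !(g.edge_growth > allowed_growth);
}
```

For a function called once with edge growth 20 and overall growth -5, and an
allowance of 0, the old test accepts and the new one defers the decision to
the later inliner.  A call with negative edge growth is accepted by both.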

>
> Now that there is a place to pile inline analysis info in, there is more to
> clean up.  The cgraph_local/cgraph_global fields probably should go, and the
> stuff from global info should go into the inline_summary data structure, too
> (the lifetimes are essentially the same, so there is no need for the split).
> I will handle this incrementally.
>
> Bootstrapped/regtested x86_64-linux with a slightly modified version of the
> patch.  Re-testing with the final version; I intend to commit the patch
> tomorrow.

I looked over the patch and it looks ok to me.
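
As a summary of the scheme, here is a minimal sketch of an on-side summary
table: a heap vector indexed by node uid, grown on demand and kept in sync by
removal and duplication callbacks.  It uses plain realloc instead of GCC's VEC
macros, and every name in it is hypothetical rather than taken from the patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reduced summary; the real one has more fields.  */
struct summary
{
  int self_size;
  int self_time;
};

/* The node itself no longer carries the summary, only its uid.  */
struct node
{
  int uid;
};

static struct summary *summary_vec;
static size_t summary_len;

/* Grow the table so uids 0..MAX_UID are valid; new slots are zeroed.  */
static void
summary_alloc (int max_uid)
{
  size_t want = (size_t) max_uid + 1;
  if (summary_len >= want)
    return;
  struct summary *p = realloc (summary_vec, want * sizeof *p);
  if (!p)
    abort ();
  memset (p + summary_len, 0, (want - summary_len) * sizeof *p);
  summary_vec = p;
  summary_len = want;
}

/* Accessor replacing direct field access on the node.  */
static struct summary *
get_summary (const struct node *n)
{
  assert ((size_t) n->uid < summary_len);
  return &summary_vec[n->uid];
}

/* Duplication callback: make room for DST, then copy SRC's summary.  */
static void
on_node_duplicated (const struct node *src, const struct node *dst,
                    int max_uid)
{
  summary_alloc (max_uid);
  *get_summary (dst) = *get_summary (src);
}

/* Removal callback: clear the slot; ignore nodes never covered.  */
static void
on_node_removed (const struct node *n)
{
  if ((size_t) n->uid >= summary_len)
    return;
  memset (get_summary (n), 0, sizeof (struct summary));
}

/* Release the whole table.  */
static void
summary_free (void)
{
  free (summary_vec);
  summary_vec = NULL;
  summary_len = 0;
}
```

The accessor is the only place that knows where summaries live, which is why
callers in the patch only needed to switch to inline_summary (node).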

Thanks,
Richard.

> Honza
>
>        * cgraph.c (dump_cgraph_node): Do not dump inline summaries.
>        * cgraph.h (struct inline_summary): Move to ipa-inline.h
>        (cgraph_local_info): Remove inline_summary.
>        * ipa-cp.c: Include ipa-inline.h.
>        (ipcp_cloning_candidate_p, ipcp_estimate_growth,
>        ipcp_estimate_cloning_cost, ipcp_insert_stage): Use inline_summary
>        accessor.
>        * lto-cgraph.c (lto_output_node): Do not stream inline summary.
>        (input_overwrite_node): Do not set inline summary.
>        (input_node): Do not stream inline summary.
>        * ipa-inline.c (cgraph_decide_inlining): Dump inline summaries.
>        (cgraph_decide_inlining_incrementally): Do not try to estimate overall
>        growth; we do not have inline parameters computed for that anyway.
>        (cgraph_early_inlining): After inlining compute call_stmt_sizes.
>        * ipa-inline.h (struct inline_summary): Move here from cgraph.h
>        (inline_summary_t): New type and VECtor.
>        (debug_inline_summary, dump_inline_summaries): Declare.
>        (inline_summary): Use VECtor.
>        (estimate_edge_growth): Kill hack computing call stmt size directly.
>        * lto-section-in.c (lto_section_name): Add inline section.
>        * ipa-inline-analysis.c: Include lto-streamer.h
>        (node_removal_hook_holder, node_duplication_hook_holder): New holders
>        (inline_node_removal_hook, inline_node_duplication_hook): New
>        functions.
>        (inline_summary_vec): Define.
>        (inline_summary_alloc, dump_inline_summary, debug_inline_summary,
>        dump_inline_summaries): New functions.
>        (estimate_function_body_sizes): Properly compute size/time of
>        outgoing calls.
>        (compute_inline_parameters): Alloc inline_summary; do not compute
>        size/time of incoming calls.
>        (estimate_edge_time): Avoid missing time summary hack.
>        (inline_read_summary): Read inline summary info.
>        (inline_write_summary): Write inline summary info.
>        (inline_free_summary): Free all hooks and inline summary vector.
>        * lto-streamer.h: Add LTO_section_inline_summary section.
>        * Makefile.in (ipa-cp.o, ipa-inline-analysis.o): Update dependencies.
>        * ipa.c (cgraph_remove_unreachable_nodes): Fix dump file formatting.
>
>        * lto.c: Include ipa-inline.h
>        (add_cgraph_node_to_partition, undo_partition): Use inline_summary
>        accessor.
>        (ipa_node_duplication_hook): Fix declaration.
>        * Make-lang.in (lto.o): Update dependencies.
> Index: cgraph.c
> ===================================================================
> --- cgraph.c    (revision 172396)
> +++ cgraph.c    (working copy)
> @@ -1876,22 +1876,6 @@ dump_cgraph_node (FILE *f, struct cgraph
>   if (node->count)
>     fprintf (f, " executed "HOST_WIDEST_INT_PRINT_DEC"x",
>             (HOST_WIDEST_INT)node->count);
> -  if (node->local.inline_summary.self_time)
> -    fprintf (f, " %i time, %i benefit", node->local.inline_summary.self_time,
> -            node->local.inline_summary.time_inlining_benefit);
> -  if (node->global.time && node->global.time
> -      != node->local.inline_summary.self_time)
> -    fprintf (f, " (%i after inlining)", node->global.time);
> -  if (node->local.inline_summary.self_size)
> -    fprintf (f, " %i size, %i benefit", node->local.inline_summary.self_size,
> -            node->local.inline_summary.size_inlining_benefit);
> -  if (node->global.size && node->global.size
> -      != node->local.inline_summary.self_size)
> -    fprintf (f, " (%i after inlining)", node->global.size);
> -  if (node->local.inline_summary.estimated_self_stack_size)
> -    fprintf (f, " %i bytes stack usage", (int)node->local.inline_summary.estimated_self_stack_size);
> -  if (node->global.estimated_stack_size != node->local.inline_summary.estimated_self_stack_size)
> -    fprintf (f, " %i bytes after inlining", (int)node->global.estimated_stack_size);
>   if (node->origin)
>     fprintf (f, " nested in: %s", cgraph_node_name (node->origin));
>   if (node->needed)
> Index: cgraph.h
> ===================================================================
> --- cgraph.h    (revision 172396)
> +++ cgraph.h    (working copy)
> @@ -58,23 +58,6 @@ struct lto_file_decl_data;
>  extern const char * const cgraph_availability_names[];
>  extern const char * const ld_plugin_symbol_resolution_names[];
>
> -/* Function inlining information.  */
> -
> -struct GTY(()) inline_summary
> -{
> -  /* Estimated stack frame consumption by the function.  */
> -  HOST_WIDE_INT estimated_self_stack_size;
> -
> -  /* Size of the function body.  */
> -  int self_size;
> -  /* How many instructions are likely going to disappear after inlining.  */
> -  int size_inlining_benefit;
> -  /* Estimated time spent executing the function body.  */
> -  int self_time;
> -  /* How much time is going to be saved by inlining.  */
> -  int time_inlining_benefit;
> -};
> -
>  /* Information about thunk, used only for same body aliases.  */
>
>  struct GTY(()) cgraph_thunk_info {
> @@ -95,8 +78,6 @@ struct GTY(()) cgraph_local_info {
>   /* File stream where this node is being written to.  */
>   struct lto_file_decl_data * lto_file_data;
>
> -  struct inline_summary inline_summary;
> -
>   /* Set when function function is visible in current compilation unit only
>      and its address is never taken.  */
>   unsigned local : 1;
> Index: ipa-cp.c
> ===================================================================
> --- ipa-cp.c    (revision 172396)
> +++ ipa-cp.c    (working copy)
> @@ -148,6 +148,7 @@ along with GCC; see the file COPYING3.
>  #include "tree-inline.h"
>  #include "fibheap.h"
>  #include "params.h"
> +#include "ipa-inline.h"
>
>  /* Number of functions identified as candidates for cloning. When not cloning
>    we can simplify iterate stage not forcing it to go through the decision
> @@ -495,7 +496,7 @@ ipcp_cloning_candidate_p (struct cgraph_
>                 cgraph_node_name (node));
>       return false;
>     }
> -  if (node->local.inline_summary.self_size < n_calls)
> +  if (inline_summary (node)->self_size < n_calls)
>     {
>       if (dump_file)
>         fprintf (dump_file, "Considering %s for cloning; code would shrink.\n",
> @@ -1189,7 +1190,7 @@ ipcp_estimate_growth (struct cgraph_node
>      call site.  Precise cost is difficult to get, as our size metric counts
>      constants and moves as free.  Generally we are looking for cases that
>      small function is called very many times.  */
> -  growth = node->local.inline_summary.self_size
> +  growth = inline_summary (node)->self_size
>           - removable_args * redirectable_node_callers;
>   if (growth < 0)
>     return 0;
> @@ -1229,7 +1230,7 @@ ipcp_estimate_cloning_cost (struct cgrap
>     cost /= freq_sum * 1000 / REG_BR_PROB_BASE + 1;
>   if (dump_file)
>     fprintf (dump_file, "Cost of versioning %s is %i, (size: %i, freq: %i)\n",
> -             cgraph_node_name (node), cost, node->local.inline_summary.self_size,
> +             cgraph_node_name (node), cost, inline_summary (node)->self_size,
>             freq_sum);
>   return cost + 1;
>  }
> @@ -1364,7 +1365,7 @@ ipcp_insert_stage (void)
>       {
>        if (node->count > max_count)
>          max_count = node->count;
> -       overall_size += node->local.inline_summary.self_size;
> +       overall_size += inline_summary (node)->self_size;
>       }
>
>   max_new_size = overall_size;
> Index: lto-cgraph.c
> ===================================================================
> --- lto-cgraph.c        (revision 172396)
> +++ lto-cgraph.c        (working copy)
> @@ -465,16 +465,6 @@ lto_output_node (struct lto_simple_outpu
>
>   if (tag == LTO_cgraph_analyzed_node)
>     {
> -      lto_output_sleb128_stream (ob->main_stream,
> -                                node->local.inline_summary.estimated_self_stack_size);
> -      lto_output_sleb128_stream (ob->main_stream,
> -                                node->local.inline_summary.self_size);
> -      lto_output_sleb128_stream (ob->main_stream,
> -                                node->local.inline_summary.size_inlining_benefit);
> -      lto_output_sleb128_stream (ob->main_stream,
> -                                node->local.inline_summary.self_time);
> -      lto_output_sleb128_stream (ob->main_stream,
> -                                node->local.inline_summary.time_inlining_benefit);
>       if (node->global.inlined_to)
>        {
>          ref = lto_cgraph_encoder_lookup (encoder, node->global.inlined_to);
> @@ -930,23 +920,9 @@ input_overwrite_node (struct lto_file_de
>                      struct cgraph_node *node,
>                      enum LTO_cgraph_tags tag,
>                      struct bitpack_d *bp,
> -                     unsigned int stack_size,
> -                     unsigned int self_time,
> -                     unsigned int time_inlining_benefit,
> -                     unsigned int self_size,
> -                     unsigned int size_inlining_benefit,
>                      enum ld_plugin_symbol_resolution resolution)
>  {
>   node->aux = (void *) tag;
> -  node->local.inline_summary.estimated_self_stack_size = stack_size;
> -  node->local.inline_summary.self_time = self_time;
> -  node->local.inline_summary.time_inlining_benefit = time_inlining_benefit;
> -  node->local.inline_summary.self_size = self_size;
> -  node->local.inline_summary.size_inlining_benefit = size_inlining_benefit;
> -  node->global.time = self_time;
> -  node->global.size = self_size;
> -  node->global.estimated_stack_size = stack_size;
> -  node->global.estimated_growth = INT_MIN;
>   node->local.lto_file_data = file_data;
>
>   node->local.local = bp_unpack_value (bp, 1);
> @@ -1023,13 +999,8 @@ input_node (struct lto_file_decl_data *f
>   tree fn_decl;
>   struct cgraph_node *node;
>   struct bitpack_d bp;
> -  int stack_size = 0;
>   unsigned decl_index;
>   int ref = LCC_NOT_FOUND, ref2 = LCC_NOT_FOUND;
> -  int self_time = 0;
> -  int self_size = 0;
> -  int time_inlining_benefit = 0;
> -  int size_inlining_benefit = 0;
>   unsigned long same_body_count = 0;
>   int clone_ref;
>   enum ld_plugin_symbol_resolution resolution;
> @@ -1051,15 +1022,7 @@ input_node (struct lto_file_decl_data *f
>   node->count_materialization_scale = lto_input_sleb128 (ib);
>
>   if (tag == LTO_cgraph_analyzed_node)
> -    {
> -      stack_size = lto_input_sleb128 (ib);
> -      self_size = lto_input_sleb128 (ib);
> -      size_inlining_benefit = lto_input_sleb128 (ib);
> -      self_time = lto_input_sleb128 (ib);
> -      time_inlining_benefit = lto_input_sleb128 (ib);
> -
> -      ref = lto_input_sleb128 (ib);
> -    }
> +    ref = lto_input_sleb128 (ib);
>
>   ref2 = lto_input_sleb128 (ib);
>
> @@ -1073,9 +1036,7 @@ input_node (struct lto_file_decl_data *f
>
>   bp = lto_input_bitpack (ib);
>   resolution = (enum ld_plugin_symbol_resolution)lto_input_uleb128 (ib);
> -  input_overwrite_node (file_data, node, tag, &bp, stack_size, self_time,
> -                       time_inlining_benefit, self_size,
> -                       size_inlining_benefit, resolution);
> +  input_overwrite_node (file_data, node, tag, &bp, resolution);
>
>   /* Store a reference for now, and fix up later to be a pointer.  */
>   node->global.inlined_to = (cgraph_node_ptr) (intptr_t) ref;
> Index: ipa-inline.c
> ===================================================================
> --- ipa-inline.c        (revision 172396)
> +++ ipa-inline.c        (working copy)
> @@ -1301,6 +1301,9 @@ cgraph_decide_inlining (void)
>              max_benefit = benefit;
>          }
>       }
> +
> +  if (dump_file)
> +    dump_inline_summaries (dump_file);
>   gcc_assert (in_lto_p
>              || !max_count
>              || (profile_info && flag_branch_probabilities));
> @@ -1558,8 +1561,7 @@ cgraph_decide_inlining_incrementally (st
>       /* When the function body would grow and inlining the function
>         won't eliminate the need for offline copy of the function,
>         don't inline.  */
> -      if (estimate_edge_growth (e) > allowed_growth
> -         && estimate_growth (e->callee) > allowed_growth)
> +      if (estimate_edge_growth (e) > allowed_growth)
>        {
>          if (dump_file)
>            fprintf (dump_file,
> @@ -1601,6 +1603,7 @@ static unsigned int
>  cgraph_early_inlining (void)
>  {
>   struct cgraph_node *node = cgraph_get_node (current_function_decl);
> +  struct cgraph_edge *edge;
>   unsigned int todo = 0;
>   int iterations = 0;
>   bool inlined = false;
> @@ -1652,6 +1655,19 @@ cgraph_early_inlining (void)
>     {
>       timevar_push (TV_INTEGRATION);
>       todo |= optimize_inline_calls (current_function_decl);
> +
> +      /* Technically we ought to recompute inline parameters so the new iteration of
> +        early inliner works as expected.  We however have values approximately right
> +        and thus we only need to update edge info that might be cleared out for
> +        newly discovered edges.  */
> +      for (edge = node->callees; edge; edge = edge->next_callee)
> +       {
> +         edge->call_stmt_size
> +           = estimate_num_insns (edge->call_stmt, &eni_size_weights);
> +         edge->call_stmt_time
> +           = estimate_num_insns (edge->call_stmt, &eni_time_weights);
> +       }
> +
>       timevar_pop (TV_INTEGRATION);
>     }
>
> Index: ipa-inline.h
> ===================================================================
> --- ipa-inline.h        (revision 172396)
> +++ ipa-inline.h        (working copy)
> @@ -19,6 +19,30 @@ You should have received a copy of the G
>  along with GCC; see the file COPYING3.  If not see
>  <http://www.gnu.org/licenses/>.  */
>
> +/* Function inlining information.  */
> +
> +struct inline_summary
> +{
> +  /* Estimated stack frame consumption by the function.  */
> +  HOST_WIDE_INT estimated_self_stack_size;
> +
> +  /* Size of the function body.  */
> +  int self_size;
> +  /* How many instructions are likely going to disappear after inlining.  */
> +  int size_inlining_benefit;
> +  /* Estimated time spent executing the function body.  */
> +  int self_time;
> +  /* How much time is going to be saved by inlining.  */
> +  int time_inlining_benefit;
> +};
> +
> +typedef struct inline_summary inline_summary_t;
> +DEF_VEC_O(inline_summary_t);
> +DEF_VEC_ALLOC_O(inline_summary_t,heap);
> +extern VEC(inline_summary_t,heap) *inline_summary_vec;
> +
> +void debug_inline_summary (struct cgraph_node *);
> +void dump_inline_summaries (FILE *f);
>  void inline_generate_summary (void);
>  void inline_read_summary (void);
>  void inline_write_summary (cgraph_node_set, varpool_node_set);
> @@ -30,7 +54,7 @@ int estimate_growth (struct cgraph_node
>  static inline struct inline_summary *
>  inline_summary (struct cgraph_node *node)
>  {
> -  return &node->local.inline_summary;
> +  return VEC_index (inline_summary_t, inline_summary_vec, node->uid);
>  }
>
>  /* Estimate the growth of the caller when inlining EDGE.  */
> @@ -39,12 +63,8 @@ static inline int
>  estimate_edge_growth (struct cgraph_edge *edge)
>  {
>   int call_stmt_size;
> -  /* ???  We throw away cgraph edges all the time so the information
> -     we store in edges doesn't persist for early inlining.  Ugh.  */
> -  if (!edge->call_stmt)
> -    call_stmt_size = edge->call_stmt_size;
> -  else
> -    call_stmt_size = estimate_num_insns (edge->call_stmt, &eni_size_weights);
> +  call_stmt_size = edge->call_stmt_size;
> +  gcc_checking_assert (call_stmt_size);
>   return (edge->callee->global.size
>          - inline_summary (edge->callee)->size_inlining_benefit
>          - call_stmt_size);
> Index: lto-section-in.c
> ===================================================================
> --- lto-section-in.c    (revision 172396)
> +++ lto-section-in.c    (working copy)
> @@ -58,7 +58,8 @@ const char *lto_section_name[LTO_N_SECTI
>   "reference",
>   "symtab",
>   "opts",
> -  "cgraphopt"
> +  "cgraphopt",
> +  "inline"
>  };
>
>  unsigned char
> Index: ipa.c
> ===================================================================
> --- ipa.c       (revision 172396)
> +++ ipa.c       (working copy)
> @@ -517,6 +517,8 @@ cgraph_remove_unreachable_nodes (bool be
>              }
>          }
>       }
> +  if (file)
> +    fprintf (file, "\n");
>
>  #ifdef ENABLE_CHECKING
>   verify_cgraph ();
> Index: ipa-inline-analysis.c
> ===================================================================
> --- ipa-inline-analysis.c       (revision 172396)
> +++ ipa-inline-analysis.c       (working copy)
> @@ -23,13 +23,13 @@ along with GCC; see the file COPYING3.
>
>    We estimate for each function
>      - function body size
> -     - function runtime
> +     - average function execution time
>      - inlining size benefit (that is how much of function body size
>        and its call sequence is expected to disappear by inlining)
>      - inlining time benefit
>      - function frame size
>    For each call
> -     - call sequence size
> +     - call statement size and time
>
>    inlinie_summary datastructures store above information locally (i.e.
>    parameters of the function itself) and globally (i.e. parameters of
> @@ -61,12 +61,100 @@ along with GCC; see the file COPYING3.
>  #include "ggc.h"
>  #include "tree-flow.h"
>  #include "ipa-prop.h"
> +#include "lto-streamer.h"
>  #include "ipa-inline.h"
>
>  #define MAX_TIME 1000000000
>
>  /* Holders of ipa cgraph hooks: */
>  static struct cgraph_node_hook_list *function_insertion_hook_holder;
> +static struct cgraph_node_hook_list *node_removal_hook_holder;
> +static struct cgraph_2node_hook_list *node_duplication_hook_holder;
> +static void inline_node_removal_hook (struct cgraph_node *, void *);
> +static void inline_node_duplication_hook (struct cgraph_node *,
> +                                         struct cgraph_node *, void *);
> +
> +/* VECtor holding inline summaries.  */
> +VEC(inline_summary_t,heap) *inline_summary_vec;
> +
> +/* Allocate the inline summary vector or resize it to cover all cgraph nodes. */
> +
> +static void
> +inline_summary_alloc (void)
> +{
> +  if (!node_removal_hook_holder)
> +    node_removal_hook_holder =
> +      cgraph_add_node_removal_hook (&inline_node_removal_hook, NULL);
> +  if (!node_duplication_hook_holder)
> +    node_duplication_hook_holder =
> +      cgraph_add_node_duplication_hook (&inline_node_duplication_hook, NULL);
> +
> +  if (VEC_length (inline_summary_t, inline_summary_vec)
> +      <= (unsigned) cgraph_max_uid)
> +    VEC_safe_grow_cleared (inline_summary_t, heap,
> +                          inline_summary_vec, cgraph_max_uid + 1);
> +}
> +
> +/* Hook that is called by cgraph.c when a node is removed.  */
> +
> +static void
> +inline_node_removal_hook (struct cgraph_node *node, void *data ATTRIBUTE_UNUSED)
> +{
> +  /* During IPA-CP updating we can be called on not-yet analyze clones.  */
> +  if (VEC_length (inline_summary_t, inline_summary_vec)
> +      <= (unsigned)node->uid)
> +    return;
> +  memset (inline_summary (node),
> +         0, sizeof (inline_summary_t));
> +}
> +
> +/* Hook that is called by cgraph.c when a node is duplicated.  */
> +
> +static void
> +inline_node_duplication_hook (struct cgraph_node *src, struct cgraph_node *dst,
> +                             ATTRIBUTE_UNUSED void *data)
> +{
> +  inline_summary_alloc ();
> +  memcpy (inline_summary (dst), inline_summary (src),
> +         sizeof (struct inline_summary));
> +}
> +
> +static void
> +dump_inline_summary (FILE *f, struct cgraph_node *node)
> +{
> +  if (node->analyzed)
> +    {
> +      struct inline_summary *s = inline_summary (node);
> +      fprintf (f, "Inline summary for %s/%i\n", cgraph_node_name (node),
> +              node->uid);
> +      fprintf (f, "  self time:       %i, benefit: %i\n",
> +              s->self_time, s->time_inlining_benefit);
> +      fprintf (f, "  global time:     %i\n", node->global.time);
> +      fprintf (f, "  self size:       %i, benefit: %i\n",
> +              s->self_size, s->size_inlining_benefit);
> +      fprintf (f, "  global size:     %i", node->global.size);
> +      fprintf (f, "  self stack:      %i\n",
> +              (int)s->estimated_self_stack_size);
> +      fprintf (f, "  global stack:    %i\n",
> +              (int)node->global.estimated_stack_size);
> +    }
> +}
> +
> +void
> +debug_inline_summary (struct cgraph_node *node)
> +{
> +  dump_inline_summary (stderr, node);
> +}
> +
> +void
> +dump_inline_summaries (FILE *f)
> +{
> +  struct cgraph_node *node;
> +
> +  for (node = cgraph_nodes; node; node = node->next)
> +    if (node->analyzed)
> +      dump_inline_summary (f, node);
> +}
>
>  /* See if statement might disappear after inlining.
>    0 - means not eliminated
> @@ -179,16 +267,27 @@ estimate_function_body_sizes (struct cgr
>                       freq, this_size, this_time);
>              print_gimple_stmt (dump_file, stmt, 0, 0);
>            }
> +
> +         if (is_gimple_call (stmt))
> +           {
> +             struct cgraph_edge *edge = cgraph_edge (node, stmt);
> +             edge->call_stmt_size = this_size;
> +             edge->call_stmt_time = this_time;
> +           }
> +
>          this_time *= freq;
>          time += this_time;
>          size += this_size;
> +
>          prob = eliminated_by_inlining_prob (stmt);
>          if (prob == 1 && dump_file && (dump_flags & TDF_DETAILS))
>            fprintf (dump_file, "    50%% will be eliminated by inlining\n");
>          if (prob == 2 && dump_file && (dump_flags & TDF_DETAILS))
>            fprintf (dump_file, "    will eliminated by inlining\n");
> +
>          size_inlining_benefit += this_size * prob;
>          time_inlining_benefit += this_time * prob;
> +
>          gcc_assert (time >= 0);
>          gcc_assert (size >= 0);
>        }
> @@ -222,6 +321,8 @@ compute_inline_parameters (struct cgraph
>
>   gcc_assert (!node->global.inlined_to);
>
> +  inline_summary_alloc ();
> +
>   /* Estimate the stack size for the function if we're optimizing.  */
>   self_stack_size = optimize ? estimated_stack_frame_size (node) : 0;
>   inline_summary (node)->estimated_self_stack_size = self_stack_size;
> @@ -247,17 +348,7 @@ compute_inline_parameters (struct cgraph
>       node->local.can_change_signature = !e;
>     }
>   estimate_function_body_sizes (node);
> -  /* Compute size of call statements.  We have to do this for callers here,
> -     those sizes need to be present for edges _to_ us as early as
> -     we are finished with early opts.  */
> -  for (e = node->callers; e; e = e->next_caller)
> -    if (e->call_stmt)
> -      {
> -       e->call_stmt_size
> -         = estimate_num_insns (e->call_stmt, &eni_size_weights);
> -       e->call_stmt_time
> -         = estimate_num_insns (e->call_stmt, &eni_time_weights);
> -      }
> +
>   /* Inlining characteristics are maintained by the cgraph_mark_inline.  */
>   node->global.time = inline_summary (node)->self_time;
>   node->global.size = inline_summary (node)->self_size;
> @@ -300,12 +391,8 @@ static inline int
>  estimate_edge_time (struct cgraph_edge *edge)
>  {
>   int call_stmt_time;
> -  /* ???  We throw away cgraph edges all the time so the information
> -     we store in edges doesn't persist for early inlining.  Ugh.  */
> -  if (!edge->call_stmt)
> -    call_stmt_time = edge->call_stmt_time;
> -  else
> -    call_stmt_time = estimate_num_insns (edge->call_stmt, &eni_time_weights);
> +  call_stmt_time = edge->call_stmt_time;
> +  gcc_checking_assert (call_stmt_time);
>   return (((gcov_type)edge->callee->global.time
>           - inline_summary (edge->callee)->time_inlining_benefit
>           - call_stmt_time) * edge->frequency
> @@ -379,8 +466,10 @@ estimate_growth (struct cgraph_node *nod
>   return growth;
>  }
>
> +
>  /* This function performs intraprocedural analysis in NODE that is required to
>    inline indirect calls.  */
> +
>  static void
>  inline_indirect_intraprocedural_analysis (struct cgraph_node *node)
>  {
> @@ -437,8 +526,6 @@ inline_generate_summary (void)
>   for (node = cgraph_nodes; node; node = node->next)
>     if (node->analyzed)
>       inline_analyze_function (node);
> -
> -  return;
>  }
>
>
> @@ -449,6 +536,57 @@ inline_generate_summary (void)
>  void
>  inline_read_summary (void)
>  {
> +  struct lto_file_decl_data **file_data_vec = lto_get_file_decl_data ();
> +  struct lto_file_decl_data *file_data;
> +  unsigned int j = 0;
> +
> +  inline_summary_alloc ();
> +
> +  while ((file_data = file_data_vec[j++]))
> +    {
> +      size_t len;
> +      const char *data = lto_get_section_data (file_data, LTO_section_inline_summary, NULL, &len);
> +
> +      struct lto_input_block *ib
> +       = lto_create_simple_input_block (file_data,
> +                                        LTO_section_inline_summary,
> +                                        &data, &len);
> +      if (ib)
> +       {
> +         unsigned int i;
> +         unsigned int f_count = lto_input_uleb128 (ib);
> +
> +         for (i = 0; i < f_count; i++)
> +           {
> +             unsigned int index;
> +             struct cgraph_node *node;
> +             struct inline_summary *info;
> +             lto_cgraph_encoder_t encoder;
> +
> +             index = lto_input_uleb128 (ib);
> +             encoder = file_data->cgraph_node_encoder;
> +             node = lto_cgraph_encoder_deref (encoder, index);
> +             info = inline_summary (node);
> +
> +             node->global.estimated_stack_size
> +               = info->estimated_self_stack_size = lto_input_uleb128 (ib);
> +             node->global.time = info->self_time = lto_input_uleb128 (ib);
> +             info->time_inlining_benefit = lto_input_uleb128 (ib);
> +             node->global.size = info->self_size = lto_input_uleb128 (ib);
> +             info->size_inlining_benefit = lto_input_uleb128 (ib);
> +             node->global.estimated_growth = INT_MIN;
> +           }
> +
> +         lto_destroy_simple_input_block (file_data,
> +                                         LTO_section_inline_summary,
> +                                         ib, data, len);
> +       }
> +      else
> +       /* Fatal error here.  We do not want to support compiling ltrans units with
> +          different version of compiler or different flags than the WPA unit, so
> +          this should never happen.  */
> +       fatal_error ("ipa reference summary is missing in ltrans unit");
> +    }
>   if (flag_indirect_inlining)
>     {
>       ipa_register_cgraph_hooks ();
> @@ -468,14 +606,57 @@ void
>  inline_write_summary (cgraph_node_set set,
>                      varpool_node_set vset ATTRIBUTE_UNUSED)
>  {
> +  struct cgraph_node *node;
> +  struct lto_simple_output_block *ob
> +    = lto_create_simple_output_block (LTO_section_inline_summary);
> +  lto_cgraph_encoder_t encoder = ob->decl_state->cgraph_node_encoder;
> +  unsigned int count = 0;
> +  int i;
> +
> +  for (i = 0; i < lto_cgraph_encoder_size (encoder); i++)
> +    if (lto_cgraph_encoder_deref (encoder, i)->analyzed)
> +      count++;
> +  lto_output_uleb128_stream (ob->main_stream, count);
> +
> +  for (i = 0; i < lto_cgraph_encoder_size (encoder); i++)
> +    {
> +      node = lto_cgraph_encoder_deref (encoder, i);
> +      if (node->analyzed)
> +       {
> +         struct inline_summary *info = inline_summary (node);
> +         lto_output_uleb128_stream (ob->main_stream,
> +                                    lto_cgraph_encoder_encode (encoder, node));
> +         lto_output_sleb128_stream (ob->main_stream,
> +                                    info->estimated_self_stack_size);
> +         lto_output_sleb128_stream (ob->main_stream,
> +                                    info->self_size);
> +         lto_output_sleb128_stream (ob->main_stream,
> +                                    info->size_inlining_benefit);
> +         lto_output_sleb128_stream (ob->main_stream,
> +                                    info->self_time);
> +         lto_output_sleb128_stream (ob->main_stream,
> +                                    info->time_inlining_benefit);
> +       }
> +    }
> +
>   if (flag_indirect_inlining && !flag_ipa_cp)
>     ipa_prop_write_jump_functions (set);
>  }
>
> +
>  /* Release inline summary.  */
>
>  void
>  inline_free_summary (void)
>  {
> -  cgraph_remove_function_insertion_hook (function_insertion_hook_holder);
> +  if (function_insertion_hook_holder)
> +    cgraph_remove_function_insertion_hook (function_insertion_hook_holder);
> +  function_insertion_hook_holder = NULL;
> +  if (node_removal_hook_holder)
> +    cgraph_remove_node_removal_hook (node_removal_hook_holder);
> +  node_removal_hook_holder = NULL;
> +  if (node_duplication_hook_holder)
> +    cgraph_remove_node_duplication_hook (node_duplication_hook_holder);
> +  node_duplication_hook_holder = NULL;
> +  VEC_free (inline_summary_t, heap, inline_summary_vec);
>  }
> Index: lto/lto.c
> ===================================================================
> --- lto/lto.c   (revision 172396)
> +++ lto/lto.c   (working copy)
> @@ -44,6 +44,7 @@ along with GCC; see the file COPYING3.
>  #include "lto-streamer.h"
>  #include "splay-tree.h"
>  #include "params.h"
> +#include "ipa-inline.h"
>
>  static GTY(()) tree first_personality_decl;
>
> @@ -750,7 +751,7 @@ add_cgraph_node_to_partition (ltrans_par
>  {
>   struct cgraph_edge *e;
>
> -  part->insns += node->local.inline_summary.self_size;
> +  part->insns += inline_summary (node)->self_size;
>
>   if (node->aux)
>     {
> @@ -811,7 +812,7 @@ undo_partition (ltrans_partition partiti
>       struct cgraph_node *node = VEC_index (cgraph_node_ptr,
>                                            partition->cgraph_set->nodes,
>                                            n_cgraph_nodes);
> -      partition->insns -= node->local.inline_summary.self_size;
> +      partition->insns -= inline_summary (node)->self_size;
>       cgraph_node_set_remove (partition->cgraph_set, node);
>       node->aux = (void *)((size_t)node->aux - 1);
>     }
> Index: lto/Make-lang.in
> ===================================================================
> --- lto/Make-lang.in    (revision 172396)
> +++ lto/Make-lang.in    (working copy)
> @@ -85,7 +85,8 @@ lto/lto.o: lto/lto.c $(CONFIG_H) $(SYSTE
>        $(CGRAPH_H) $(GGC_H) tree-ssa-operands.h $(TREE_PASS_H) \
>        langhooks.h $(VEC_H) $(BITMAP_H) pointer-set.h $(IPA_PROP_H) \
>        $(COMMON_H) debug.h $(TIMEVAR_H) $(GIMPLE_H) $(LTO_H) $(LTO_TREE_H) \
> -       $(LTO_TAGS_H) $(LTO_STREAMER_H) $(SPLAY_TREE_H) gt-lto-lto.h $(PARAMS_H)
> +       $(LTO_TAGS_H) $(LTO_STREAMER_H) $(SPLAY_TREE_H) gt-lto-lto.h $(PARAMS_H) \
> +       ipa-inline.h
>  lto/lto-object.o: lto/lto-object.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
>        $(DIAGNOSTIC_CORE_H) $(LTO_H) $(TM_H) $(LTO_STREAMER_H) \
>        ../include/simple-object.h
> Index: ipa-prop.c
> ===================================================================
> --- ipa-prop.c  (revision 172396)
> +++ ipa-prop.c  (working copy)
> @@ -1998,7 +1998,7 @@ ipa_edge_duplication_hook (struct cgraph
>
>  static void
>  ipa_node_duplication_hook (struct cgraph_node *src, struct cgraph_node *dst,
> -                          __attribute__((unused)) void *data)
> +                          ATTRIBUTE_UNUSED void *data)
>  {
>   struct ipa_node_params *old_info, *new_info;
>   int param_count, i;
> Index: Makefile.in
> ===================================================================
> --- Makefile.in (revision 172396)
> +++ Makefile.in (working copy)
> @@ -3011,7 +3011,7 @@ ipa-ref.o : ipa-ref.c $(CONFIG_H) $(SYST
>  ipa-cp.o : ipa-cp.c $(CONFIG_H) $(SYSTEM_H) coretypes.h  \
>    $(TREE_H) $(TARGET_H) $(GIMPLE_H) $(CGRAPH_H) $(IPA_PROP_H) $(TREE_FLOW_H) \
>    $(TREE_PASS_H) $(FLAGS_H) $(TIMEVAR_H) $(DIAGNOSTIC_H) $(TREE_DUMP_H) \
> -   $(TREE_INLINE_H) $(FIBHEAP_H) $(PARAMS_H) tree-pretty-print.h
> +   $(TREE_INLINE_H) $(FIBHEAP_H) $(PARAMS_H) tree-pretty-print.h ipa-inline.h
>  ipa-split.o : ipa-split.c $(CONFIG_H) $(SYSTEM_H) coretypes.h  \
>    $(TREE_H) $(TARGET_H) $(CGRAPH_H) $(IPA_PROP_H) $(TREE_FLOW_H) \
>    $(TREE_PASS_H) $(FLAGS_H) $(TIMEVAR_H) $(DIAGNOSTIC_H) $(TREE_DUMP_H) \
> @@ -3032,7 +3032,7 @@ ipa-inline-analysis.o : ipa-inline-analy
>    $(TREE_H) langhooks.h $(TREE_INLINE_H) $(FLAGS_H) $(CGRAPH_H) intl.h \
>    $(DIAGNOSTIC_H) $(PARAMS_H) $(TIMEVAR_H) $(TREE_PASS_H) \
>    $(HASHTAB_H) $(COVERAGE_H) $(GGC_H) $(TREE_FLOW_H) $(IPA_PROP_H) \
> -   gimple-pretty-print.h ipa-inline.h
> +   gimple-pretty-print.h ipa-inline.h $(LTO_STREAMER_H)
>  ipa-utils.o : ipa-utils.c $(IPA_UTILS_H) $(CONFIG_H) $(SYSTEM_H) \
>    coretypes.h $(TM_H) $(TREE_H) $(TREE_FLOW_H) $(TREE_INLINE_H) langhooks.h \
>    pointer-set.h $(GGC_H) $(GIMPLE_H) $(SPLAY_TREE_H) \
> Index: lto-streamer.h
> ===================================================================
> --- lto-streamer.h      (revision 172396)
> +++ lto-streamer.h      (working copy)
> @@ -264,6 +264,7 @@ enum lto_section_type
>   LTO_section_symtab,
>   LTO_section_opts,
>   LTO_section_cgraph_opt_sum,
> +  LTO_section_inline_summary,
>   LTO_N_SECTION_TYPES          /* Must be last.  */
>  };
>
>
