On Mon, Nov 14, 2022 at 11:36 AM Manolis Tsamis <manolis.tsa...@vrull.eu> wrote:
>
> On Mon, Nov 14, 2022 at 9:37 AM Richard Biener
> <richard.guent...@gmail.com> wrote:
> >
> > On Sun, Nov 13, 2022 at 4:38 PM Christoph Muellner
> > <christoph.muell...@vrull.eu> wrote:
> > >
> > > From: mtsamis <manolis.tsa...@vrull.eu>
> > >
> > > The IPA CP pass offers a wide range of optimizations, where most of them
> > > lead to specialized functions that are called from a call site.
> > > This can lead to multiple specialized function clones, if more than
> > > one call-site allows such an optimization.
> > > If not all call-sites can be optimized, the program might end
> > > up with call-sites to the original function.
> > >
> > > This pass assumes that non-optimized call-sites (i.e. call-sites
> > > that don't call specialized functions) are likely to be called
> > > with arguments that would allow calling specialized clones.
> > > Since we cannot guarantee this (for obvious reasons), we can't
> > > replace the existing calls. However, we can introduce dynamic
> > > guards that test the arguments for the collected constants
> > > and call the specialized function if there is a match.
> > >
> > > To demonstrate the effect, let's consider the following program part:
> > >
> > >   func_1()
> > >     myfunc(1)
> > >   func_2()
> > >     myfunc(2)
> > >   func_i(i)
> > >     myfunc(i)
> > >
> > > In this case the transformation would do the following:
> > >
> > >   func_1()
> > >     myfunc.constprop.1() // myfunc() with arg0 == 1
> > >   func_2()
> > >     myfunc.constprop.2() // myfunc() with arg0 == 2
> > >   func_i(i)
> > >     if (i == 1)
> > >       myfunc.constprop.1() // myfunc() with arg0 == 1
> > >     else if (i == 2)
> > >       myfunc.constprop.2() // myfunc() with arg0 == 2
> > >     else
> > >       myfunc(i)
> > >
> > > The pass consists of two main parts:
> > > * collecting all specialized functions and the argument/constant pair(s)
> > > * insertion of the guards during materialization
> > >
> > > The patch integrates well into ipa-cp and related IPA functionality.
> > > Given the nature of IPA, the changes are touching many IPA-related
> > > files as well as call-graph data structures.
> > >
> > > The impact of the dynamic guard is expected to be less than the speedup
> > > gained by enabled optimizations (e.g. inlining or constant propagation).
> >
> > I don't see any limits on the number of callee candidates or the complexity
> > of the guard.  Is there any reason to not factor the guards into a wrapper
> > function to avoid bloating cold call sites and to allow inlining to decide
> > where the expansion is useful?
> >
>
> There is indeed no limit on the number of guards or guard complexity
> currently. Would it be a good choice here to introduce two parameters
> for the maximum number of guards and conditions per guard and assign
> some sane default value?
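To make the guard-limit question concrete, here is a minimal C sketch of a guarded call site with a cap on how many guards are tested before falling back to the generic function. It is only an illustration under stated assumptions: the names `call_myfunc_guarded`, `max_guards`, `specialized_hits` and the `myfunc_constprop_*` stand-ins are hypothetical, and the patch itself emits an inline if/else chain in GIMPLE rather than walking a table at run time.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the generic function and its IPA-CP clones.  */
static int myfunc (int i) { return i * 10; }
static int myfunc_constprop_1 (void) { return 10; }   /* myfunc (1) */
static int myfunc_constprop_2 (void) { return 20; }   /* myfunc (2) */

/* One guard: the constant to compare against and the clone to call.  */
struct guard
{
  int cst;
  int (*clone) (void);
};

static const struct guard guards[] = {
  { 1, myfunc_constprop_1 },
  { 2, myfunc_constprop_2 },
};

/* Illustration only: counts calls that took a specialized path.  */
static int specialized_hits;

/* Guarded call site.  Tests at most MAX_GUARDS constants (an assumed
   tuning knob, not a parameter of the posted patch) and falls back to
   the generic function when none of them matches.  */
static int
call_myfunc_guarded (int i, size_t max_guards)
{
  size_t n = sizeof guards / sizeof guards[0];
  if (max_guards < n)
    n = max_guards;

  for (size_t k = 0; k < n; k++)
    if (i == guards[k].cst)
      {
        specialized_hits++;
        return guards[k].clone ();
      }

  return myfunc (i);
}
```

With `max_guards` set to 1, a call with `i == 2` skips the second guard and goes straight to `myfunc (2)`. That is the trade-off a limit would control: fewer compares on the fallback path in exchange for fewer specialized hits.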
Yes, that sounds good.

> About the wrapper functions, that is an interesting question that I haven't
> explored as much. One reason is that this transformation aims to work in
> a similar way as the speculative edges (which already existed). Since the
> expected number of guards is low (1-2 in most cases), I considered the
> two optimizations quite similar and wanted to share as much of the design
> and functionality as I could. I also tried to make the overhead of the
> non-specialized original function call as low as possible.
>
> But I can also see how there is a difference between the speculative and
> specialized edges that makes creating a wrapper meaningful for
> this case: The maximum speedup of a direct vs indirect function
> call can be much smaller than that of a specialized call instead
> of the generic one.

Yes - IIRC there was also the idea to generally wrap not specialized
calls or to modify the not specialized copy itself.

> > Skimming the patch I noticed an #if 0 commented assert with a comment
> > that this was to be temporary?
> >
>
> Thanks for pointing that out, this is unintentional. I will fix it.
>
> Best,
> Manolis
>
> > Thanks,
> > Richard.
> >
> > > PR ipa/107667
> > > gcc/ChangeLog:
> > >
> > > * cgraph.cc (cgraph_add_edge_to_call_site_hash): Add support for
> > > guarded specialized edges.
> > > (cgraph_edge::set_call_stmt): Likewise.
> > > (symbol_table::create_edge): Likewise.
> > > (cgraph_edge::remove): Likewise.
> > > (cgraph_edge::make_speculative): Likewise.
> > > (cgraph_edge::make_specialized): Likewise.
> > > (cgraph_edge::remove_specializations): Likewise.
> > > (cgraph_edge::redirect_call_stmt_to_callee): Likewise.
> > > (cgraph_edge::dump_edge_flags): Likewise.
> > > (verify_speculative_call): Likewise.
> > > (verify_specialized_call): Likewise.
> > > (cgraph_node::verify_node): Likewise.
> > > * cgraph.h (class GTY): Add new class that contains info of
> > > specialized edges.
> > > * cgraphclones.cc (cgraph_edge::clone): Add support for guarded > > > specialized edges. > > > (cgraph_node::set_call_stmt_including_clones): Likewise. > > > * ipa-cp.cc (want_remove_some_param_p): Likewise. > > > (create_specialized_node): Likewise. > > > (add_specialized_edges): Likewise. > > > (ipcp_driver): Likewise. > > > * ipa-fnsummary.cc (redirect_to_unreachable): Likewise. > > > (ipa_fn_summary_t::duplicate): Likewise. > > > (analyze_function_body): Likewise. > > > (estimate_edge_size_and_time): Likewise. > > > (remap_edge_summaries): Likewise. > > > * ipa-inline-transform.cc (inline_transform): Likewise. > > > * ipa-inline.cc (edge_badness): Likewise. > > > lto-cgraph.cc (lto_output_edge): Likewise. > > > (input_edge): Likewise. > > > * tree-inline.cc (copy_bb): Likewise. > > > * value-prof.cc (gimple_sc): Add function to create guarded > > > specializations. > > > * value-prof.h (gimple_sc): Likewise. > > > > > > Signed-off-by: Manolis Tsamis <manolis.tsa...@vrull.eu> > > > --- > > > gcc/cgraph.cc | 316 +++++++++++++++++++++++++++++++++++- > > > gcc/cgraph.h | 102 ++++++++++++ > > > gcc/cgraphclones.cc | 30 ++++ > > > gcc/common.opt | 4 + > > > gcc/ipa-cp.cc | 105 ++++++++++++ > > > gcc/ipa-fnsummary.cc | 42 +++++ > > > gcc/ipa-inline-transform.cc | 11 ++ > > > gcc/ipa-inline.cc | 5 + > > > gcc/lto-cgraph.cc | 46 ++++++ > > > gcc/tree-inline.cc | 54 ++++++ > > > gcc/value-prof.cc | 214 ++++++++++++++++++++++++ > > > gcc/value-prof.h | 1 + > > > 12 files changed, 923 insertions(+), 7 deletions(-) > > > > > > diff --git a/gcc/cgraph.cc b/gcc/cgraph.cc > > > index 5851b2ffc6c..ee819c87261 100644 > > > --- a/gcc/cgraph.cc > > > +++ b/gcc/cgraph.cc > > > @@ -718,18 +718,24 @@ cgraph_add_edge_to_call_site_hash (cgraph_edge *e) > > > one indirect); always hash the direct one. 
*/ > > > if (e->speculative && e->indirect_unknown_callee) > > > return; > > > + /* There are potentially multiple specialization edges for every > > > + specialized call; always hash the base egde. */ > > > + if (e->guarded_specialization_edge_p ()) > > > + return; > > > cgraph_edge **slot = e->caller->call_site_hash->find_slot_with_hash > > > (e->call_stmt, cgraph_edge_hasher::hash (e->call_stmt), INSERT); > > > if (*slot) > > > { > > > - gcc_assert (((cgraph_edge *)*slot)->speculative); > > > + gcc_assert (((cgraph_edge *)*slot)->speculative > > > + || ((cgraph_edge *)*slot)->specialized); > > > if (e->callee && (!e->prev_callee > > > || !e->prev_callee->speculative > > > + || !e->prev_callee->specialized > > > || e->prev_callee->call_stmt != e->call_stmt)) > > > *slot = e; > > > return; > > > } > > > - gcc_assert (!*slot || e->speculative); > > > + gcc_assert (!*slot || e->speculative || e->specialized); > > > *slot = e; > > > } > > > > > > @@ -800,6 +806,23 @@ cgraph_edge::set_call_stmt (cgraph_edge *e, gcall > > > *new_stmt, > > > gcc_checking_assert (new_direct_callee); > > > } > > > > > > + /* Update specialized first and do not return yet in case we're dealing > > > + with an edge that is both specialized and speculative. */ > > > + if (update_speculative && e->specialized) > > > + { > > > + cgraph_edge *next, *base = e->specialized_call_base_edge (); > > > + for (cgraph_edge *d = e->first_specialized_call_target (); d; d = > > > next) > > > + { > > > + next = d->next_specialized_call_target (); > > > + cgraph_edge *d2 = set_call_stmt (d, new_stmt, false); > > > + gcc_assert (d2 == d); > > > + } > > > + > > > + /* Don't update base for speculative edges. The code below will. > > > */ > > > + if (!e->speculative) > > > + set_call_stmt (base, new_stmt, false); > > > + } > > > + > > > /* Speculative edges has three component, update all of them > > > when asked to. 
*/ > > > if (update_speculative && e->speculative > > > @@ -835,12 +858,16 @@ cgraph_edge::set_call_stmt (cgraph_edge *e, gcall > > > *new_stmt, > > > return e_indirect ? indirect : direct; > > > } > > > > > > + if (update_speculative && e->specialized) > > > + return e; > > > + > > > if (new_direct_callee) > > > e = make_direct (e, new_direct_callee); > > > > > > /* Only direct speculative edges go to call_site_hash. */ > > > if (e->caller->call_site_hash > > > && (!e->speculative || !e->indirect_unknown_callee) > > > + && (!e->specialized || e->spec_args == NULL) > > > /* It is possible that edge was previously speculative. In this > > > case > > > we have different value in call stmt hash which needs > > > preserving. */ > > > && e->caller->get_edge (e->call_stmt) == e) > > > @@ -854,11 +881,12 @@ cgraph_edge::set_call_stmt (cgraph_edge *e, gcall > > > *new_stmt, > > > /* Update call stite hash. For speculative calls we only record the > > > first > > > direct edge. */ > > > if (e->caller->call_site_hash > > > - && (!e->speculative > > > + && ((!e->speculative && !e->specialized) > > > || (e->callee > > > && (!e->prev_callee || !e->prev_callee->speculative > > > || e->prev_callee->call_stmt != e->call_stmt)) > > > - || (e->speculative && !e->callee))) > > > + || (e->speculative && !e->callee) > > > + || e->base_specialization_edge_p ())) > > > cgraph_add_edge_to_call_site_hash (e); > > > return e; > > > } > > > @@ -883,7 +911,8 @@ symbol_table::create_edge (cgraph_node *caller, > > > cgraph_node *callee, > > > construction of call stmt hashtable. 
*/ > > > cgraph_edge *e; > > > gcc_checking_assert (!(e = caller->get_edge (call_stmt)) > > > - || e->speculative); > > > + || e->speculative > > > + || e->specialized); > > > > > > gcc_assert (is_gimple_call (call_stmt)); > > > } > > > @@ -909,6 +938,8 @@ symbol_table::create_edge (cgraph_node *caller, > > > cgraph_node *callee, > > > edge->indirect_info = NULL; > > > edge->indirect_inlining_edge = 0; > > > edge->speculative = false; > > > + edge->specialized = false; > > > + edge->spec_args = NULL; > > > edge->indirect_unknown_callee = indir_unknown_callee; > > > if (call_stmt && caller->call_site_hash) > > > cgraph_add_edge_to_call_site_hash (edge); > > > @@ -1066,6 +1097,11 @@ symbol_table::free_edge (cgraph_edge *e) > > > void > > > cgraph_edge::remove (cgraph_edge *edge) > > > { > > > + /* If we remove the base edge of a group of specialized > > > + edges then we must also remove all of its specializations. */ > > > + if (edge->base_specialization_edge_p ()) > > > + cgraph_edge::remove_specializations (edge); > > > + > > > /* Call all edge removal hooks. */ > > > symtab->call_edge_removal_hooks (edge); > > > > > > @@ -1109,6 +1145,8 @@ cgraph_edge::make_speculative (cgraph_node *n2, > > > profile_count direct_count, > > > ipa_ref *ref = NULL; > > > cgraph_edge *e2; > > > > > > + gcc_checking_assert (!specialized); > > > + > > > if (dump_file) > > > fprintf (dump_file, "Indirect call -> speculative call %s => %s\n", > > > n->dump_name (), n2->dump_name ()); > > > @@ -1134,6 +1172,62 @@ cgraph_edge::make_speculative (cgraph_node *n2, > > > profile_count direct_count, > > > return e2; > > > } > > > > > > +/* Mark this edge as specialized and add a new edge representing that N2 > > > + is a specialized version of the CALLE of this edge, with the > > > specialized > > > + arguments found in SPEC_ARGS. 
*/ > > > +cgraph_edge * > > > +cgraph_edge::make_specialized (cgraph_node *n2, > > > + vec<cgraph_specialization_info>* > > > spec_args, > > > + profile_count spec_count) > > > +{ > > > + if (speculative) > > > + { > > > + /* Because both speculative and specialized edges use CALL_STMT and > > > + LTO_STMT_UID to link edges together there is a limitation in > > > + specializing speculative edges. Only one group of specialized > > > + edges can exist for a given group of speculative edges. */ > > > + for (cgraph_edge *direct = first_speculative_call_target (); > > > + direct; direct = direct->next_speculative_call_target ()) > > > + if (direct != this && direct->specialized) > > > + return NULL; > > > + } > > > + > > > + cgraph_node *n = caller; > > > + cgraph_edge *e2; > > > + > > > + if (dump_file) > > > + fprintf (dump_file, "Creating guarded specialized edge %s -> %s " > > > + "from%s callee %s\n", > > > + caller->dump_name (), n2->dump_name (), > > > + (speculative? " speculative" : ""), > > > + callee->dump_name ()); > > > + specialized = true; > > > + e2 = n->create_edge (n2, call_stmt, spec_count); > > > + > > > + /* We don't want to inline the specialized edges seperately. If the > > > base > > > + specialized edge is inlined then we will drop the specializations. 
> > > */ > > > + e2->inline_failed = CIF_UNSPECIFIED; > > > + if (TREE_NOTHROW (n2->decl)) > > > + e2->can_throw_external = false; > > > + else > > > + e2->can_throw_external = can_throw_external; > > > + > > > + e2->specialized = true; > > > + > > > + unsigned i; > > > + cgraph_specialization_info* spec_info; > > > + vec_alloc (e2->spec_args, spec_args->length ()); > > > + > > > + FOR_EACH_VEC_ELT (*spec_args, i, spec_info) > > > + e2->spec_args->quick_push (*spec_info); > > > + > > > + e2->lto_stmt_uid = lto_stmt_uid; > > > + e2->in_polymorphic_cdtor = in_polymorphic_cdtor; > > > + count -= e2->count; > > > + symtab->call_edge_duplication_hooks (this, e2); > > > + return e2; > > > +} > > > + > > > /* Speculative call consists of an indirect edge and one or more > > > direct edge+ref pairs. > > > > > > @@ -1364,6 +1458,39 @@ cgraph_edge::make_direct (cgraph_edge *edge, > > > cgraph_node *callee) > > > return edge; > > > } > > > > > > +/* Given the base edge of a group of specialized edges remove all its > > > + specialized edges. Essentially this can be used to undo the descision > > > + to specialize EDGE. */ > > > + > > > +void > > > +cgraph_edge::remove_specializations (cgraph_edge *edge) > > > +{ > > > + if (!edge->specialized) > > > + return; > > > + > > > + if (edge->base_specialization_edge_p ()) > > > + { > > > + cgraph_edge *next; > > > + for (cgraph_edge *e2 = edge->caller->callees; e2; e2 = next) > > > + { > > > + next = e2->next_callee; > > > + > > > + if (e2->guarded_specialization_edge_p () > > > + && edge->call_stmt == e2->call_stmt > > > + && edge->lto_stmt_uid == e2->lto_stmt_uid) > > > + { > > > + edge->count += e2->count; > > > + if (e2->inline_failed) > > > + remove (e2); > > > + else > > > + e2->callee->remove_symbol_and_inline_clones (); > > > + } > > > + } > > > + } > > > + else > > > + gcc_checking_assert (false); > > > +} > > > + > > > /* Redirect callee of the edge to N. The function does not update > > > underlying > > > call expression. 
*/ > > > > > > @@ -1411,6 +1538,7 @@ cgraph_edge::redirect_call_stmt_to_callee > > > (cgraph_edge *e) > > > { > > > tree decl = gimple_call_fndecl (e->call_stmt); > > > gcall *new_stmt; > > > + bool remove_specializations_if_base = true; > > > > > > if (e->speculative) > > > { > > > @@ -1467,6 +1595,27 @@ cgraph_edge::redirect_call_stmt_to_callee > > > (cgraph_edge *e) > > > /* Indirect edges are not both in the call site hash. > > > get it updated. */ > > > update_call_stmt_hash_for_removing_direct_edge (e, indirect); > > > + > > > + if (e->specialized) > > > + { > > > + gcc_checking_assert (e->base_specialization_edge_p ()); > > > + > > > + /* If we're materializing a speculative and base > > > specialized edge > > > + then we want to keep the specializations alive. This > > > amounts > > > + to changing the call statements of the guarded > > > + specializations. */ > > > + remove_specializations_if_base = false; > > > + cgraph_edge *next; > > > + > > > + for (cgraph_edge *d = e->first_specialized_call_target (); > > > + d; d = next) > > > + { > > > + next = d->next_specialized_call_target (); > > > + cgraph_edge *d2 = set_call_stmt (d, new_stmt, false); > > > + gcc_assert (d2 == d); > > > + } > > > + } > > > + > > > cgraph_edge::set_call_stmt (e, new_stmt, false); > > > e->count = gimple_bb (e->call_stmt)->count; > > > > > > @@ -1482,6 +1631,53 @@ cgraph_edge::redirect_call_stmt_to_callee > > > (cgraph_edge *e) > > > } > > > } > > > > > > + if (e->specialized) > > > + { > > > + if (e->spec_args != NULL) > > > + { > > > + /* Be sure we redirect all specialized targets before poking > > > + about base edge. */ > > > + cgraph_edge *base = e->specialized_call_base_edge (); > > > + gcall *new_stmt; > > > + > > > + /* Expand specialization into GIMPLE code. 
*/ > > > + if (dump_file) > > > + fprintf (dump_file, > > > + "Expanding specialized call of %s -> %s\n", > > > + e->caller->dump_name (), e->callee->dump_name ()); > > > + > > > + push_cfun (DECL_STRUCT_FUNCTION (e->caller->decl)); > > > + > > > + profile_count all = base->count; > > > + for (cgraph_edge *e2 = e->first_specialized_call_target (); > > > + e2; e2 = e2->next_specialized_call_target ()) > > > + all = all + e2->count; > > > + > > > + profile_probability prob = e->count.probability_in (all); > > > + if (!prob.initialized_p ()) > > > + prob = profile_probability::even (); > > > + > > > + new_stmt = gimple_sc (e, prob); > > > + e->specialized = false; > > > + e->spec_args = NULL; > > > + if (!base->first_specialized_call_target ()) > > > + base->specialized = false; > > > + > > > + cgraph_edge::set_call_stmt (e, new_stmt, false); > > > + e->count = gimple_bb (e->call_stmt)->count; > > > + /* Once we are done with expanding the sequence, update also > > > base > > > + call probability. Until then the basic block accounts for > > > the > > > + sum of specialized edges and all non-expanded > > > specializations. */ > > > + if (!base->specialized) > > > + base->count = gimple_bb (base->call_stmt)->count; > > > + > > > + pop_cfun (); > > > + } > > > + else if (remove_specializations_if_base) > > > + /* The specialized edges are in part connected by CALL_STMT so if > > > + we change it for the base edge then remove all > > > specializations. 
*/ > > > + cgraph_edge::remove_specializations (e); > > > + } > > > > > > if (e->indirect_unknown_callee > > > || decl == e->callee->decl) > > > @@ -2069,6 +2265,10 @@ cgraph_edge::dump_edge_flags (FILE *f) > > > { > > > if (speculative) > > > fprintf (f, "(speculative) "); > > > + if (base_specialization_edge_p ()) > > > + fprintf (f, "(specialized base) "); > > > + if (guarded_specialization_edge_p ()) > > > + fprintf (f, "(guarded specialization) "); > > > if (!inline_failed) > > > fprintf (f, "(inlined) "); > > > if (call_stmt_cannot_inline_p) > > > @@ -3313,6 +3513,10 @@ verify_speculative_call (struct cgraph_node *node, > > > gimple *stmt, > > > direct = direct->next_callee) > > > if (direct->call_stmt == stmt && direct->lto_stmt_uid == > > > lto_stmt_uid) > > > { > > > + /* Guarded specialized edges share the same CALL_STMT and > > > LTO_STMT_UID > > > + but are handled separately. */ > > > + if (direct->guarded_specialization_edge_p ()) > > > + continue; > > > if (!first_call) > > > first_call = direct; > > > if (prev_call && direct != prev_call->next_callee) > > > @@ -3402,6 +3606,93 @@ verify_speculative_call (struct cgraph_node *node, > > > gimple *stmt, > > > return false; > > > } > > > > > > +/* Verify consistency of specialized call in NODE corresponding to STMT > > > + and LTO_STMT_UID. If BASE is set, assume that it is the base > > > + edge of call sequence. Return true if error is found. > > > + > > > + This function is called to every component of specialized call (base > > > edge > > > + and specialized edges). To save duplicated work, do full testing only > > > + when testing the base edge. 
*/ > > > +static bool > > > +verify_specialized_call (struct cgraph_node *node, gimple *stmt, > > > + unsigned int lto_stmt_uid, > > > + struct cgraph_edge *base) > > > +{ > > > + if (base == NULL) > > > + { > > > + cgraph_edge *base; > > > + for (base = node->callees; base; > > > + base = base->next_callee) > > > + if (base->call_stmt == stmt > > > + && base->lto_stmt_uid == lto_stmt_uid > > > + && base->spec_args == NULL) > > > + break; > > > + if (!base) > > > + { > > > + error ("missing base call in specialized call sequence"); > > > + return true; > > > + } > > > + if (!base->specialized) > > > + { > > > + error ("base call in specialized call sequence has no " > > > + "specialized flag"); > > > + return true; > > > + } > > > + for (base = base->next_callee; base; > > > + base = base->next_callee) > > > + if (base->call_stmt == stmt > > > + && base->lto_stmt_uid == lto_stmt_uid > > > + && base->spec_args == NULL) > > > + { > > > + error ("cannot have more than one base edge in specialized " > > > + "call sequence"); > > > + return true; > > > + } > > > + return false; > > > + } > > > + > > > + cgraph_edge *prev_call = NULL; > > > + > > > + cgraph_node *origin_base = base->callee; > > > + while (origin_base->clone_of) > > > + origin_base = origin_base->clone_of; > > > + > > > + for (cgraph_edge *spec = node->callees; spec; > > > + spec = spec->next_callee) > > > + if (spec->call_stmt == stmt > > > + && spec->lto_stmt_uid == lto_stmt_uid > > > + && spec->spec_args != NULL) > > > + { > > > + cgraph_node *origin_spec = spec->callee; > > > + while (origin_spec->clone_of) > > > + origin_spec = origin_spec->clone_of; > > > + > > > + if (spec->callee->clone_of && origin_base != origin_spec) > > > + { > > > + error ("specialized call to %s in specialized call sequence > > > has " > > > + "different origin than base %s %s %s", > > > + origin_spec->dump_name (), origin_base->dump_name (), > > > + base->callee->dump_name (), spec->callee->dump_name > > > ()); > > > + 
return true; > > > + } > > > + > > > + if (prev_call && spec != prev_call->next_callee) > > > + { > > > + error ("specialized edges are not adjacent"); > > > + return true; > > > + } > > > + prev_call = spec; > > > + if (!spec->specialized) > > > + { > > > + error ("call to %s in specialized call sequence has no " > > > + "specialized flag", spec->callee->dump_name ()); > > > + return true; > > > + } > > > + } > > > + > > > + return false; > > > +} > > > + > > > /* Verify cgraph nodes of given cgraph node. */ > > > DEBUG_FUNCTION void > > > cgraph_node::verify_node (void) > > > @@ -3578,6 +3869,7 @@ cgraph_node::verify_node (void) > > > if (gimple_has_body_p (e->caller->decl) > > > && !e->caller->inlined_to > > > && !e->speculative > > > + && !e->specialized > > > /* Optimized out calls are redirected to __builtin_unreachable. > > > */ > > > && (e->count.nonzero_p () > > > || ! e->callee->decl > > > @@ -3604,6 +3896,10 @@ cgraph_node::verify_node (void) > > > && verify_speculative_call (e->caller, e->call_stmt, > > > e->lto_stmt_uid, > > > NULL)) > > > error_found = true; > > > + if (e->specialized > > > + && verify_specialized_call (e->caller, e->call_stmt, > > > e->lto_stmt_uid, > > > + e->spec_args == NULL? 
e : NULL)) > > > + error_found = true; > > > } > > > for (e = indirect_calls; e; e = e->next_callee) > > > { > > > @@ -3612,6 +3908,7 @@ cgraph_node::verify_node (void) > > > if (gimple_has_body_p (e->caller->decl) > > > && !e->caller->inlined_to > > > && !e->speculative > > > + && !e->specialized > > > && e->count.ipa_p () > > > && count > > > == ENTRY_BLOCK_PTR_FOR_FN (DECL_STRUCT_FUNCTION > > > (decl))->count > > > @@ -3630,6 +3927,11 @@ cgraph_node::verify_node (void) > > > && verify_speculative_call (e->caller, e->call_stmt, > > > e->lto_stmt_uid, > > > e)) > > > error_found = true; > > > + if (e->specialized || e->spec_args != NULL) > > > + { > > > + error ("Cannot have specialized edges in indirect call"); > > > + error_found = true; > > > + } > > > } > > > for (i = 0; iterate_reference (i, ref); i++) > > > { > > > @@ -3824,7 +4126,7 @@ cgraph_node::verify_node (void) > > > > > > for (e = callees; e; e = e->next_callee) > > > { > > > - if (!e->aux && !e->speculative) > > > + if (!e->aux && !e->speculative && !e->specialized) > > > { > > > error ("edge %s->%s has no corresponding call_stmt", > > > identifier_to_locale (e->caller->name ()), > > > @@ -3836,7 +4138,7 @@ cgraph_node::verify_node (void) > > > } > > > for (e = indirect_calls; e; e = e->next_callee) > > > { > > > - if (!e->aux && !e->speculative) > > > + if (!e->aux && !e->speculative && !e->specialized) > > > { > > > error ("an indirect edge from %s has no corresponding > > > call_stmt", > > > identifier_to_locale (e->caller->name ())); > > > diff --git a/gcc/cgraph.h b/gcc/cgraph.h > > > index 4be67e3cea9..4caed96e803 100644 > > > --- a/gcc/cgraph.h > > > +++ b/gcc/cgraph.h > > > @@ -1683,6 +1683,19 @@ public: > > > unsigned vptr_changed : 1; > > > }; > > > > > > +class GTY (()) cgraph_specialization_info > > > +{ > > > +public: > > > + unsigned arg_idx; > > > + int is_unsigned; /* Whether the specialization constant is unsigned. 
> > > */ > > > + union > > > + { > > > + HOST_WIDE_INT GTY ((tag ("0"))) sval; > > > + unsigned HOST_WIDE_INT GTY ((tag ("1"))) uval; > > > + } > > > + GTY ((desc ("%1.is_unsigned"))) cst; > > > +}; > > > + > > > class GTY((chain_next ("%h.next_caller"), chain_prev ("%h.prev_caller"), > > > for_user)) cgraph_edge > > > { > > > @@ -1723,6 +1736,12 @@ public: > > > */ > > > cgraph_edge *make_speculative (cgraph_node *n2, profile_count > > > direct_count, > > > unsigned int speculative_id = 0); > > > + /* Mark that this edge represents a specialized call to N2. > > > + SPEC_ARGS represent the position and values of the CALL_STMT of > > > this edge > > > + that are specialized in N2. */ > > > + cgraph_edge *make_specialized (cgraph_node *n2, > > > + vec<cgraph_specialization_info> > > > *spec_args, > > > + profile_count spec_count); > > > > > > /* Speculative call consists of an indirect edge and one or more > > > direct edge+ref pairs. Speculative will expand to the following > > > sequence: > > > @@ -1802,6 +1821,66 @@ public: > > > gcc_unreachable (); > > > } > > > > > > + /* Return the first edge that represents a specialization of the > > > CALL_STMT > > > + of this edge if one exists or NULL otherwise. */ > > > + cgraph_edge *first_specialized_call_target () > > > + { > > > + gcc_checking_assert (specialized && callee); > > > + for (cgraph_edge *e2 = caller->callees; > > > + e2; e2 = e2->next_callee) > > > + if (e2->guarded_specialization_edge_p () > > > + && call_stmt == e2->call_stmt > > > + && lto_stmt_uid == e2->lto_stmt_uid) > > > + return e2; > > > + > > > + return NULL; > > > + } > > > + > > > + /* Return the next edge that represents a specialization of the > > > CALL_STMT > > > + of this edge if one exists or NULL otherwise. 
*/ > > > + cgraph_edge *next_specialized_call_target () > > > + { > > > + cgraph_edge *e = this; > > > + gcc_checking_assert (specialized && callee); > > > + > > > + if (e->next_callee > > > + && e->next_callee->guarded_specialization_edge_p () > > > + && e->next_callee->call_stmt == e->call_stmt > > > + && e->next_callee->lto_stmt_uid == e->lto_stmt_uid) > > > + return e->next_callee; > > > + return NULL; > > > + } > > > + > > > + /* When called on any edge in a specialized call return the (unique) > > > + edge that points to the non specialized function. */ > > > + cgraph_edge *specialized_call_base_edge () > > > + { > > > + gcc_checking_assert (specialized && callee); > > > + for (cgraph_edge *e2 = caller->callees; > > > + e2; e2 = e2->next_callee) > > > + if (e2->base_specialization_edge_p () > > > + && call_stmt == e2->call_stmt > > > + && lto_stmt_uid == e2->lto_stmt_uid) > > > + return e2; > > > + > > > + return NULL; > > > + } > > > + > > > + /* Return true iff this edge is part of specialized sequence and is the > > > + original edge for which other specialization edges potentially > > > exist. */ > > > + bool base_specialization_edge_p () const > > > + { > > > + return specialized && spec_args == NULL; > > > + } > > > + > > > + /* Return true iff this edge is part of specialized sequence and it > > > + represents a potential specialization target that canbe used instead > > > + of the base edge. */ > > > + bool guarded_specialization_edge_p () const > > > + { > > > + return specialized && spec_args != NULL; > > > + } > > > + > > > /* Speculative call edge turned out to be direct call to CALLEE_DECL. > > > Remove > > > the speculative call sequence and return edge representing the > > > call, the > > > original EDGE can be removed and deallocated. 
It is up to caller to > > > @@ -1820,6 +1899,11 @@ public: > > > static cgraph_edge *resolve_speculation (cgraph_edge *edge, > > > tree callee_decl = NULL); > > > > > > + /* Given the base edge of a group of specialized edges remove all its > > > + specialized edges. Essentially this can be used to undo the > > > descision > > > + to specialize EDGE. */ > > > + static void remove_specializations (cgraph_edge *edge); > > > + > > > /* If necessary, change the function declaration in the call statement > > > associated with edge E so that it corresponds to the edge callee. > > > Speculations can be resolved in the process and EDGE can be removed > > > and > > > @@ -1895,6 +1979,9 @@ public: > > > /* Additional information about an indirect call. Not cleared when an > > > edge > > > becomes direct. */ > > > cgraph_indirect_call_info *indirect_info; > > > + /* If this edge has a specialized function as a callee then this vector > > > + holds the indices and values of the specialized arguments. */ > > > + vec<cgraph_specialization_info>* GTY ((skip (""))) spec_args; > > > void *GTY ((skip (""))) aux; > > > /* When equal to CIF_OK, inline this call. Otherwise, points to the > > > explanation why function was not inlined. */ > > > @@ -1933,6 +2020,21 @@ public: > > > Optimizers may later redirect direct call to clone, so 1) and 3) > > > do not need to necessarily agree with destination. */ > > > unsigned int speculative : 1; > > > + /* Edges with SPECIALIZED flag represents calls that have additional > > > + specialized functions that can be used instead (as a result of > > > ipa-cp). > > > + The final code sequence will have form: > > > + > > > + if (specialized_arg_0 == specialized_const_0 > > > + && ... > > > + && specialized_arg_i == specialized_const_i) > > > + call_target.constprop.N (non_specialized_arg_0, ...); > > > + ... > > > + more potential specializations > > > + ... 
> > > + else > > > + call_target (); > > > + */ > > > + unsigned int specialized : 1; > > > /* Set to true when caller is a constructor or destructor of > > > polymorphic > > > type. */ > > > unsigned in_polymorphic_cdtor : 1; > > > diff --git a/gcc/cgraphclones.cc b/gcc/cgraphclones.cc > > > index bb4b3c5407d..9e12fa19180 100644 > > > --- a/gcc/cgraphclones.cc > > > +++ b/gcc/cgraphclones.cc > > > @@ -141,6 +141,20 @@ cgraph_edge::clone (cgraph_node *n, gcall > > > *call_stmt, unsigned stmt_uid, > > > new_edge->can_throw_external = can_throw_external; > > > new_edge->call_stmt_cannot_inline_p = call_stmt_cannot_inline_p; > > > new_edge->speculative = speculative; > > > + > > > + new_edge->specialized = specialized; > > > + new_edge->spec_args = NULL; > > > + > > > + if (spec_args) > > > + { > > > + unsigned i; > > > + cgraph_specialization_info* spec_info; > > > + vec_alloc (new_edge->spec_args, spec_args->length ()); > > > + > > > + FOR_EACH_VEC_ELT (*spec_args, i, spec_info) > > > + new_edge->spec_args->quick_push (*spec_info); > > > + } > > > + > > > new_edge->in_polymorphic_cdtor = in_polymorphic_cdtor; > > > > > > /* Update IPA profile. Local profiles need no updating in original. 
> > > */ > > > @@ -791,6 +805,22 @@ cgraph_node::set_call_stmt_including_clones (gimple > > > *old_stmt, > > > } > > > indirect->speculative = false; > > > } > > > + > > > + if (edge->specialized && !update_speculative) > > > + { > > > + cgraph_edge *base = edge->specialized_call_base_edge (); > > > + > > > + for (cgraph_edge *next, *specialized > > > + = edge->first_specialized_call_target (); > > > + specialized; > > > + specialized = next) > > > + { > > > + next = specialized->next_specialized_call_target (); > > > + specialized->specialized = false; > > > + } > > > + base->specialized = false; > > > + } > > > + > > > } > > > if (node->clones) > > > node = node->clones; > > > diff --git a/gcc/common.opt b/gcc/common.opt > > > index bce3e514f65..437f2f4295b 100644 > > > --- a/gcc/common.opt > > > +++ b/gcc/common.opt > > > @@ -1933,6 +1933,10 @@ fipa-bit-cp > > > Common Var(flag_ipa_bit_cp) Optimization > > > Perform interprocedural bitwise constant propagation. > > > > > > +fipa-guarded-specialization > > > +Common Var(flag_ipa_guarded_specialization) Optimization > > > +Add speculative edges for existing specialized functions. > > > + > > > fipa-modref > > > Common Var(flag_ipa_modref) Optimization > > > Perform interprocedural modref analysis. > > > diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc > > > index d2bcd5e5e69..5a24f6987ac 100644 > > > --- a/gcc/ipa-cp.cc > > > +++ b/gcc/ipa-cp.cc > > > @@ -119,6 +119,7 @@ along with GCC; see the file COPYING3. 
If not see > > > #include "symbol-summary.h" > > > #include "tree-vrp.h" > > > #include "ipa-prop.h" > > > +#include "gimple-pretty-print.h" > > > #include "tree-pretty-print.h" > > > #include "tree-inline.h" > > > #include "ipa-fnsummary.h" > > > @@ -5239,6 +5240,8 @@ want_remove_some_param_p (cgraph_node *node, > > > vec<tree> known_csts) > > > return false; > > > } > > > > > > +static hash_map<cgraph_node*, vec<cgraph_node*>> > > > *available_specializations; > > > + > > > /* Create a specialized version of NODE with known constants in > > > KNOWN_CSTS, > > > known contexts in KNOWN_CONTEXTS and known aggregate values in > > > AGGVALS and > > > redirect all edges in CALLERS to it. */ > > > @@ -5409,6 +5412,13 @@ create_specialized_node (struct cgraph_node *node, > > > new_info->known_csts = known_csts; > > > new_info->known_contexts = known_contexts; > > > > > > + if (!info->ipcp_orig_node) > > > + { > > > + vec<cgraph_node*> &spec_nodes > > > + = available_specializations->get_or_insert (node); > > > + spec_nodes.safe_push (new_node); > > > + } > > > + > > > ipcp_discover_new_direct_edges (new_node, known_csts, known_contexts, > > > aggvals); > > > > > > @@ -6538,6 +6548,96 @@ ipcp_store_vr_results (void) > > > } > > > } > > > > > > +/* Add new edges to the call graph to represent the available > > > specializations > > > + of each specialized function. 
*/ > > > +static void > > > +add_specialized_edges (void) > > > +{ > > > + cgraph_edge *e; > > > + cgraph_node *n, *spec_n; > > > + tree spec_v; > > > + unsigned i, j; > > > + > > > + FOR_EACH_DEFINED_FUNCTION (n) > > > + { > > > + if (dump_file && n->callees) > > > + fprintf (dump_file, > > > + "Processing function %s for specialization of edges.\n", > > > + n->dump_name ()); > > > + > > > + if (n->ipcp_clone) > > > + continue; > > > + > > > + bool update = false; > > > + for (e = n->callees; e; e = e->next_callee) > > > + { > > > + if (!e->callee || e->recursive_p ()) > > > + continue; > > > + > > > + vec<cgraph_node*> *specialization_nodes > > > + = available_specializations->get (e->callee); > > > + > > > + if (!specialization_nodes) > > > + continue; > > > + > > > + FOR_EACH_VEC_ELT (*specialization_nodes, i, spec_n) > > > + { > > > + if (dump_file) > > > + fprintf (dump_file, > > > + "Edge has available specialization %s.\n", > > > + spec_n->dump_name ()); > > > + > > > + ipa_node_params *spec_params = ipa_node_params_sum->get > > > (spec_n); > > > + vec<cgraph_specialization_info> replaced_args = vNULL; > > > + bool failed = false; > > > + > > > + FOR_EACH_VEC_ELT (spec_params->known_csts, j, spec_v) > > > + { > > > + if (spec_v != NULL_TREE) > > > + { > > > + if (TREE_CODE (spec_v) == INTEGER_CST > > > + && TYPE_UNSIGNED (TREE_TYPE (spec_v)) > > > + && tree_fits_uhwi_p (spec_v)) > > > + { > > > + cgraph_specialization_info spec_info; > > > + spec_info.arg_idx = j; > > > + spec_info.is_unsigned = 1; > > > + spec_info.cst.uval = tree_to_uhwi (spec_v); > > > + replaced_args.safe_push (spec_info); > > > + } > > > + else if (TREE_CODE (spec_v) == INTEGER_CST > > > + && !TYPE_UNSIGNED (TREE_TYPE (spec_v)) > > > + && tree_fits_shwi_p (spec_v)) > > > + { > > > + cgraph_specialization_info spec_info; > > > + spec_info.arg_idx = j; > > > + spec_info.is_unsigned = 0; > > > + spec_info.cst.sval = tree_to_shwi (spec_v); > > > + replaced_args.safe_push (spec_info); > > > 
+ } > > > + else > > > + { > > > + failed = true; > > > + break; > > > + } > > > + } > > > + } > > > + > > > + if (!failed && replaced_args.length () > 0) > > > + { > > > + if (e->make_specialized (spec_n, > > > + &replaced_args, > > > + e->count.apply_scale (1, 10))) > > > + update = true; > > > + } > > > + } > > > + } > > > + > > > + if (update) > > > + ipa_update_overall_fn_summary (n); > > > + } > > > +} > > > + > > > /* The IPCP driver. */ > > > > > > static unsigned int > > > @@ -6551,6 +6651,7 @@ ipcp_driver (void) > > > ipa_check_create_node_params (); > > > ipa_check_create_edge_args (); > > > clone_num_suffixes = new hash_map<const char *, unsigned>; > > > + available_specializations = new hash_map<cgraph_node*, > > > vec<cgraph_node*>>; > > > > > > if (dump_file) > > > { > > > @@ -6570,8 +6671,12 @@ ipcp_driver (void) > > > ipcp_store_bits_results (); > > > /* Store results of value range propagation. */ > > > ipcp_store_vr_results (); > > > + /* Add new edges for specializations. */ > > > + if (flag_ipa_guarded_specialization) > > > + add_specialized_edges (); > > > > > > /* Free all IPCP structures. */ > > > + delete available_specializations; > > > delete clone_num_suffixes; > > > free_toporder_info (&topo); > > > delete edge_clone_summaries; > > > diff --git a/gcc/ipa-fnsummary.cc b/gcc/ipa-fnsummary.cc > > > index fd3d7d6c5e8..a1f219a056e 100644 > > > --- a/gcc/ipa-fnsummary.cc > > > +++ b/gcc/ipa-fnsummary.cc > > > @@ -257,6 +257,13 @@ redirect_to_unreachable (struct cgraph_edge *e) > > > e = cgraph_edge::resolve_speculation (e, target->decl); > > > else if (!e->callee) > > > e = cgraph_edge::make_direct (e, target); > > > + else if (e->base_specialization_edge_p ()) > > > + { > > > + /* If the base edge becomes unreachable there's no reason to > > > + keep the specializations around. 
*/ > > > + cgraph_edge::remove_specializations (e); > > > + e->redirect_callee (target); > > > + } > > > else > > > e->redirect_callee (target); > > > class ipa_call_summary *es = ipa_call_summaries->get (e); > > > @@ -866,6 +873,7 @@ ipa_fn_summary_t::duplicate (cgraph_node *src, > > > ipa_predicate new_predicate; > > > class ipa_call_summary *es = ipa_call_summaries->get (edge); > > > next = edge->next_callee; > > > + bool update_next = edge->specialized; > > > > > > if (!edge->inline_failed) > > > inlined_to_p = true; > > > @@ -876,6 +884,9 @@ ipa_fn_summary_t::duplicate (cgraph_node *src, > > > if (new_predicate == false && *es->predicate != false) > > > optimized_out_size += es->call_stmt_size * > > > ipa_fn_summary::size_scale; > > > edge_set_predicate (edge, &new_predicate); > > > + /* NEXT may be invalidated for specialized calls. */ > > > + if (update_next) > > > + next = edge->next_callee; > > > } > > > > > > /* Remap indirect edge predicates with the same simplification as > > > above. > > > @@ -2825,6 +2836,29 @@ analyze_function_body (struct cgraph_node *node, > > > bool early) > > > es, es3); > > > } > > > } > > > + if (edge->specialized) > > > + { > > > + cgraph_edge *base > > > + = edge->specialized_call_base_edge (); > > > + ipa_call_summary *es2 > > > + = ipa_call_summaries->get_create (base); > > > + ipa_call_summaries->duplicate (edge, base, > > > + es, es2); > > > + > > > + /* Edge is the first direct call. > > > + Create and duplicate call summaries for multiple > > > + specialized call targets. 
*/ > > > + for (cgraph_edge *specialization > > > + = edge->next_specialized_call_target (); > > > + specialization; specialization > > > + = specialization->next_specialized_call_target > > > ()) > > > + { > > > + ipa_call_summary *es3 > > > + = ipa_call_summaries->get_create (specialization); > > > + ipa_call_summaries->duplicate (edge, specialization, > > > + es, es3); > > > + } > > > + } > > > } > > > > > > /* TODO: When conditional jump or switch is known to be > > > constant, but > > > @@ -3275,6 +3309,9 @@ estimate_edge_size_and_time (struct cgraph_edge *e, > > > int *size, int *min_size, > > > sreal *time, ipa_call_arg_values *avals, > > > ipa_hints *hints) > > > { > > > + if (e->guarded_specialization_edge_p ()) > > > + return; > > > + > > > class ipa_call_summary *es = ipa_call_summaries->get (e); > > > int call_size = es->call_stmt_size; > > > int call_time = es->call_stmt_time; > > > @@ -4050,6 +4087,7 @@ remap_edge_summaries (struct cgraph_edge > > > *inlined_edge, > > > { > > > ipa_predicate p; > > > next = e->next_callee; > > > + bool update_next = e->specialized; > > > > > > if (e->inline_failed) > > > { > > > @@ -4073,6 +4111,10 @@ remap_edge_summaries (struct cgraph_edge > > > *inlined_edge, > > > params_summary, callee_info, > > > operand_map, offset_map, possible_truths, > > > toplev_predicate); > > > + > > > + /* NEXT may be invalidated for specialized calls. */ > > > + if (update_next) > > > + next = e->next_callee; > > > } > > > for (e = node->indirect_calls; e; e = next) > > > { > > > diff --git a/gcc/ipa-inline-transform.cc b/gcc/ipa-inline-transform.cc > > > index 07288e57c73..d0b9cd9e599 100644 > > > --- a/gcc/ipa-inline-transform.cc > > > +++ b/gcc/ipa-inline-transform.cc > > > @@ -775,11 +775,22 @@ inline_transform (struct cgraph_node *node) > > > } > > > > > > maybe_materialize_called_clones (node); > > > + /* Perform call statement redirection in two steps. 
In the first step > > > + only consider speculative edges and then process the rest in a > > > separate > > > + step. This is required due to the potential existence of edges > > > that are > > > + both speculative and specialized, in which case we need to process > > > them > > > + in this order. */ > > > for (e = node->callees; e; e = next) > > > { > > > if (!e->inline_failed) > > > has_inline = true; > > > next = e->next_callee; > > > + if (e->speculative) > > > + cgraph_edge::redirect_call_stmt_to_callee (e); > > > + } > > > + for (e = node->callees; e; e = next) > > > + { > > > + next = e->next_callee; > > > cgraph_edge::redirect_call_stmt_to_callee (e); > > > } > > > node->remove_all_references (); > > > diff --git a/gcc/ipa-inline.cc b/gcc/ipa-inline.cc > > > index 14969198cde..c6cd2b92f6e 100644 > > > --- a/gcc/ipa-inline.cc > > > +++ b/gcc/ipa-inline.cc > > > @@ -1185,12 +1185,17 @@ edge_badness (struct cgraph_edge *edge, bool dump) > > > edge_time = estimate_edge_time (edge, &unspec_edge_time); > > > hints = estimate_edge_hints (edge); > > > gcc_checking_assert (edge_time >= 0); > > > + > > > + /* Temporarily disabled due to the way time is calculated > > > + with specialized edges. */ > > > +#if 0 > > > /* Check that inlined time is better, but tolerate some roundoff > > > issues. > > > FIXME: When callee profile drops to 0 we account calls more. This > > > should be fixed by never doing that. 
*/ > > > gcc_checking_assert ((edge_time * 100 > > > - callee_info->time * 101).to_int () <= 0 > > > || callee->count.ipa ().initialized_p ()); > > > +#endif > > > gcc_checking_assert (growth <= ipa_size_summaries->get (callee)->size); > > > > > > if (dump) > > > diff --git a/gcc/lto-cgraph.cc b/gcc/lto-cgraph.cc > > > index 350195d86db..c8250f7b73c 100644 > > > --- a/gcc/lto-cgraph.cc > > > +++ b/gcc/lto-cgraph.cc > > > @@ -271,6 +271,8 @@ lto_output_edge (struct lto_simple_output_block *ob, > > > struct cgraph_edge *edge, > > > bp_pack_value (&bp, edge->speculative_id, 16); > > > bp_pack_value (&bp, edge->indirect_inlining_edge, 1); > > > bp_pack_value (&bp, edge->speculative, 1); > > > + bp_pack_value (&bp, edge->specialized, 1); > > > + bp_pack_value (&bp, edge->spec_args != NULL, 1); > > > bp_pack_value (&bp, edge->call_stmt_cannot_inline_p, 1); > > > gcc_assert (!edge->call_stmt_cannot_inline_p > > > || edge->inline_failed != CIF_BODY_NOT_AVAILABLE); > > > @@ -295,7 +297,27 @@ lto_output_edge (struct lto_simple_output_block *ob, > > > struct cgraph_edge *edge, > > > bp_pack_value (&bp, > > > edge->indirect_info->num_speculative_call_targets, > > > 16); > > > } > > > + > > > streamer_write_bitpack (&bp); > > > + > > > + if (edge->spec_args != NULL) > > > + { > > > + cgraph_specialization_info *spec_info; > > > + unsigned len = edge->spec_args->length (), i; > > > + streamer_write_uhwi_stream (ob->main_stream, len); > > > + > > > + FOR_EACH_VEC_ELT (*edge->spec_args, i, spec_info) > > > + { > > > + unsigned idx = spec_info->arg_idx; > > > + streamer_write_uhwi_stream (ob->main_stream, idx); > > > + streamer_write_hwi_stream (ob->main_stream, > > > spec_info->is_unsigned); > > > + > > > + if (spec_info->is_unsigned) > > > + streamer_write_uhwi_stream (ob->main_stream, > > > spec_info->cst.uval); > > > + else > > > + streamer_write_hwi_stream (ob->main_stream, > > > spec_info->cst.sval); > > > + } > > > + } > > > } > > > > > > /* Return if NODE contain references 
from other partitions. */ > > > @@ -1517,6 +1539,8 @@ input_edge (class lto_input_block *ib, > > > vec<symtab_node *> nodes, > > > > > > edge->indirect_inlining_edge = bp_unpack_value (&bp, 1); > > > edge->speculative = bp_unpack_value (&bp, 1); > > > + edge->specialized = bp_unpack_value (&bp, 1); > > > + bool has_edge_spec_args = bp_unpack_value (&bp, 1); > > > edge->lto_stmt_uid = stmt_id; > > > edge->speculative_id = speculative_id; > > > edge->inline_failed = inline_failed; > > > @@ -1542,6 +1566,28 @@ input_edge (class lto_input_block *ib, > > > vec<symtab_node *> nodes, > > > edge->indirect_info->num_speculative_call_targets > > > = bp_unpack_value (&bp, 16); > > > } > > > + > > > + if (has_edge_spec_args) > > > + { > > > + unsigned len = streamer_read_uhwi (ib); > > > + vec_alloc (edge->spec_args, len); > > > + > > > + for (unsigned i = 0; i < len; i++) > > > + { > > > + cgraph_specialization_info spec_info; > > > + spec_info.arg_idx = streamer_read_uhwi (ib); > > > + spec_info.is_unsigned = streamer_read_hwi (ib); > > > + > > > + if (spec_info.is_unsigned) > > > + spec_info.cst.uval = streamer_read_uhwi (ib); > > > + else > > > + spec_info.cst.sval = streamer_read_hwi (ib); > > > + > > > + edge->spec_args->quick_push (spec_info); > > > + } > > > + } > > > + else > > > + edge->spec_args = NULL; > > > } > > > > > > > > > diff --git a/gcc/tree-inline.cc b/gcc/tree-inline.cc > > > index 8091ba8f13b..26657f7c017 100644 > > > --- a/gcc/tree-inline.cc > > > +++ b/gcc/tree-inline.cc > > > @@ -2307,6 +2307,60 @@ copy_bb (copy_body_data *id, basic_block bb, > > > indirect->count > > > = copy_basic_block->count.apply_probability > > > (prob); > > > } > > > + /* A specialized call consists of multiple > > > + edges - a base edge and one or more specialized > > > edges. > > > + Duplicate and distribute frequencies in a way > > > similar > > > + to the speculative edges. 
*/ > > > + else if (edge->specialized) > > > + { > > > + int n = 0; > > > + cgraph_edge *first > > > + = > > > old_edge->first_specialized_call_target (); > > > + profile_count spec_cnt > > > + = profile_count::zero (); > > > + > > > + /* First figure out the distribution of counts > > > + so we can re-scale BB profile accordingly. > > > */ > > > + for (cgraph_edge *e = first; e; > > > + e = e->next_specialized_call_target ()) > > > + spec_cnt = spec_cnt + e->count; > > > + > > > + cgraph_edge *base > > > + = old_edge->specialized_call_base_edge > > > (); > > > + profile_count base_cnt = base->count; > > > + > > > + /* Next iterate over all specializations, clone them > > > + and update the profile. */ > > > + for (cgraph_edge *e = first; e; > > > + e = e->next_specialized_call_target ()) > > > + { > > > + profile_count cnt = e->count; > > > + > > > + edge = e->clone (id->dst_node, call_stmt, > > > + gimple_uid (stmt), num, > > > den, > > > + true); > > > + profile_probability prob > > > + = cnt.probability_in (spec_cnt > > > + + base_cnt); > > > + edge->count > > > + = > > > copy_basic_block->count.apply_probability > > > + (prob); > > > + n++; > > > + } > > > + > > > + /* Duplicate the base edge after all specialized > > > + edges are cloned. */ > > > + base = base->clone (id->dst_node, call_stmt, > > > + gimple_uid (stmt), > > > + num, den, > > > + true); > > > + > > > + profile_probability prob > > > + = base_cnt.probability_in (spec_cnt > > > + + base_cnt); > > > + base->count > > > + = copy_basic_block->count.apply_probability > > > (prob); > > > + } > > > else > > > { > > > edge = edge->clone (id->dst_node, call_stmt, > > > diff --git a/gcc/value-prof.cc b/gcc/value-prof.cc > > > index 9656ce5870d..3db0070bcbf 100644 > > > --- a/gcc/value-prof.cc > > > +++ b/gcc/value-prof.cc > > > @@ -42,6 +42,8 @@ along with GCC; see the file COPYING3. 
If not see > > > #include "gimple-pretty-print.h" > > > #include "dumpfile.h" > > > #include "builtins.h" > > > +#include "tree-cfg.h" > > > +#include "tree-dfa.h" > > > > > > /* In this file value profile based optimizations are placed. Currently > > > the > > > following optimizations are implemented (for more detailed > > > descriptions > > > @@ -1434,6 +1436,218 @@ gimple_ic (gcall *icall_stmt, struct cgraph_node > > > *direct_call, > > > return dcall_stmt; > > > } > > > > > > +/* Do transformation > > > + > > > + if (arg_i == spec_args[y] && ...) > > > + do call to specialized target callee > > > + else > > > + old call > > > + */ > > > + > > > +gcall * > > > +gimple_sc (struct cgraph_edge *edg, profile_probability prob) > > > +{ > > > + /* The call statement we're modifying. */ > > > + gcall *call_stmt = edg->call_stmt; > > > + /* The cgraph_node of the specialized function. */ > > > + cgraph_node *callee = edg->callee; > > > + vec<cgraph_specialization_info> *spec_args = edg->spec_args; > > > + > > > + /* CALL_STMT should be the call_stmt of the generic function. */ > > > + gcc_checking_assert (edg->specialized_call_base_edge ()->call_stmt > > > + == call_stmt); > > > + > > > + gcall *spec_call_stmt = NULL; > > > + tree cond_tree = NULL_TREE; > > > + gcond *cond_stmt = NULL; > > > + basic_block cond_bb, dcall_bb, icall_bb, join_bb = NULL; > > > + edge e_cd, e_ci, e_di, e_dj = NULL, e_ij; > > > + gimple_stmt_iterator gsi; > > > + int lp_nr, dflags; > > > + edge e_eh, e; > > > + edge_iterator ei; > > > + > > > + cond_bb = gimple_bb (call_stmt); > > > + gsi = gsi_for_stmt (call_stmt); > > > + > > > + /* To call the specialized function we need to build a guard > > > conditional > > > + with the specialized arguments and constants. 
*/ > > > + unsigned nargs = gimple_call_num_args (call_stmt); > > > + unsigned cur_spec = 0; > > > + bool dump_first = true; > > > + > > > + if (dump_file) > > > + { > > > + fprintf (dump_file, "Creating specialization guard for edge %s -> > > > %s:\n", > > > + edg->caller->dump_name (), > > > edg->callee->dump_name ()); > > > + fprintf (dump_file, "if ("); > > > + } > > > + > > > + for (unsigned arg_idx = 0; arg_idx < nargs; arg_idx++) > > > + { > > > + tree cur_arg = gimple_call_arg (call_stmt, arg_idx); > > > + bool cur_arg_specialized_p = cur_spec < spec_args->length () > > > + && arg_idx == (*spec_args)[cur_spec].arg_idx; > > > + > > > + if (cur_arg_specialized_p) > > > + { > > > + gcc_checking_assert (!cond_stmt); > > > + > > > + cgraph_specialization_info spec_info = (*spec_args)[cur_spec]; > > > + cur_spec++; > > > + > > > + tree spec_v; > > > + if (spec_info.is_unsigned) > > > + spec_v = build_int_cstu (integer_type_node, > > > spec_info.cst.uval); > > > + else > > > + spec_v = build_int_cst (integer_type_node, > > > spec_info.cst.sval); > > > + > > > + tree cmp_const = fold_convert (TREE_TYPE (cur_arg), spec_v); > > > + > > > + tree cur_arg_eq_spec = build2 (EQ_EXPR, boolean_type_node, > > > + cur_arg, cmp_const); > > > + > > > + if (dump_file) > > > + { > > > + if (!dump_first) > > > + fprintf (dump_file, " && "); > > > + print_generic_expr (dump_file, cur_arg_eq_spec); > > > + dump_first = false; > > > + } > > > + > > > + tree tmp1 = make_temp_ssa_name (boolean_type_node, NULL, > > > "SPEC"); > > > + gassign* load_stmt1 = gimple_build_assign (tmp1, > > > cur_arg_eq_spec); > > > + gsi_insert_before (&gsi, load_stmt1, GSI_SAME_STMT); > > > + > > > + if (!cond_tree) > > > + cond_tree = tmp1; > > > + else > > > + { > > > + tree cur_and_prev_true = fold_build2 (BIT_AND_EXPR, > > > + boolean_type_node, > > > + cond_tree, > > > + tmp1); > > > + > > > + tree tmp2 = make_temp_ssa_name (boolean_type_node, NULL, > > > "SPEC"); > > > + gassign* load_stmt2 > > > + = 
gimple_build_assign (tmp2, cur_and_prev_true); > > > + gsi_insert_before (&gsi, load_stmt2, GSI_SAME_STMT); > > > + cond_tree = tmp2; > > > + } > > > + } > > > + } > > > + > > > + cond_stmt = gimple_build_cond (EQ_EXPR, cond_tree, boolean_true_node, > > > + NULL_TREE, NULL_TREE); > > > + > > > + gsi_insert_before (&gsi, cond_stmt, GSI_SAME_STMT); > > > + > > > + if (gimple_vdef (call_stmt) > > > + && TREE_CODE (gimple_vdef (call_stmt)) == SSA_NAME) > > > + { > > > + unlink_stmt_vdef (call_stmt); > > > + release_ssa_name (gimple_vdef (call_stmt)); > > > + } > > > + gimple_set_vdef (call_stmt, NULL_TREE); > > > + gimple_set_vuse (call_stmt, NULL_TREE); > > > + update_stmt (call_stmt); > > > + spec_call_stmt = as_a <gcall *> (gimple_copy (call_stmt)); > > > + gimple_call_set_fndecl (spec_call_stmt, callee->decl); > > > + dflags = flags_from_decl_or_type (callee->decl); > > > + > > > + if ((dflags & ECF_NORETURN) != 0 > > > + && should_remove_lhs_p (gimple_call_lhs (spec_call_stmt))) > > > + gimple_call_set_lhs (spec_call_stmt, NULL_TREE); > > > + gsi_insert_before (&gsi, spec_call_stmt, GSI_SAME_STMT); > > > + > > > + if (dump_file) > > > + { > > > + fprintf (dump_file, ")\n "); > > > + print_gimple_stmt (dump_file, spec_call_stmt, 0); > > > + } > > > + > > > + e_cd = split_block (cond_bb, cond_stmt); > > > + dcall_bb = e_cd->dest; > > > + dcall_bb->count = cond_bb->count.apply_probability (prob); > > > + > > > + e_di = split_block (dcall_bb, spec_call_stmt); > > > + icall_bb = e_di->dest; > > > + icall_bb->count = cond_bb->count - dcall_bb->count; > > > + > > > + if (!stmt_ends_bb_p (call_stmt)) > > > + e_ij = split_block (icall_bb, call_stmt); > > > + else > > > + { > > > + e_ij = find_fallthru_edge (icall_bb->succs); > > > + if (e_ij != NULL) > > > + { > > > + e_ij->probability = profile_probability::always (); > > > + e_ij = single_pred_edge (split_edge (e_ij)); > > > + } > > > + } > > > + if (e_ij != NULL) > > > + { > > > + join_bb = e_ij->dest; > > > + 
join_bb->count = cond_bb->count; > > > + } > > > + > > > + e_cd->flags = (e_cd->flags & ~EDGE_FALLTHRU) | EDGE_TRUE_VALUE; > > > + e_cd->probability = prob; > > > + > > > + e_ci = make_edge (cond_bb, icall_bb, EDGE_FALSE_VALUE); > > > + e_ci->probability = prob.invert (); > > > + > > > + remove_edge (e_di); > > > + > > > + if (e_ij != NULL) > > > + { > > > + if ((dflags & ECF_NORETURN) == 0) > > > + { > > > + e_dj = make_edge (dcall_bb, join_bb, EDGE_FALLTHRU); > > > + e_dj->probability = profile_probability::always (); > > > + } > > > + e_ij->probability = profile_probability::always (); > > > + } > > > + > > > + if (gimple_call_lhs (call_stmt) > > > + && TREE_CODE (gimple_call_lhs (call_stmt)) == SSA_NAME > > > + && (dflags & ECF_NORETURN) == 0) > > > + { > > > + tree result = gimple_call_lhs (call_stmt); > > > + gphi *phi = create_phi_node (result, join_bb); > > > + gimple_call_set_lhs (call_stmt, > > > + duplicate_ssa_name (result, call_stmt)); > > > + add_phi_arg (phi, gimple_call_lhs (call_stmt), e_ij, > > > UNKNOWN_LOCATION); > > > + gimple_call_set_lhs (spec_call_stmt, > > > + duplicate_ssa_name (result, spec_call_stmt)); > > > + add_phi_arg (phi, gimple_call_lhs (spec_call_stmt), e_dj, > > > + UNKNOWN_LOCATION); > > > + } > > > + > > > + lp_nr = lookup_stmt_eh_lp (call_stmt); > > > + if (lp_nr > 0 && stmt_could_throw_p (cfun, spec_call_stmt)) > > > + { > > > + add_stmt_to_eh_lp (spec_call_stmt, lp_nr); > > > + } > > > + > > > + FOR_EACH_EDGE (e_eh, ei, icall_bb->succs) > > > + if (e_eh->flags & (EDGE_EH | EDGE_ABNORMAL)) > > > + { > > > + e = make_edge (dcall_bb, e_eh->dest, e_eh->flags); > > > + e->probability = e_eh->probability; > > > + for (gphi_iterator psi = gsi_start_phis (e_eh->dest); > > > + !gsi_end_p (psi); gsi_next (&psi)) > > > + { > > > + gphi *phi = psi.phi (); > > > + SET_USE (PHI_ARG_DEF_PTR_FROM_EDGE (phi, e), > > > + PHI_ARG_DEF_FROM_EDGE (phi, e_eh)); > > > + } > > > + } > > > + if (!stmt_could_throw_p (cfun, spec_call_stmt)) > > > + 
gimple_purge_dead_eh_edges (dcall_bb); > > > + return spec_call_stmt; > > > +} > > > + > > > /* Dump info about indirect call profile. */ > > > > > > static void > > > diff --git a/gcc/value-prof.h b/gcc/value-prof.h > > > index d852c41f33f..7d8be5920b9 100644 > > > --- a/gcc/value-prof.h > > > +++ b/gcc/value-prof.h > > > @@ -89,6 +89,7 @@ void verify_histograms (void); > > > void free_histograms (function *); > > > void stringop_block_profile (gimple *, unsigned int *, HOST_WIDE_INT *); > > > gcall *gimple_ic (gcall *, struct cgraph_node *, profile_probability); > > > +gcall *gimple_sc (struct cgraph_edge *, profile_probability); > > > bool get_nth_most_common_value (gimple *stmt, const char *counter_type, > > > histogram_value hist, gcov_type *value, > > > gcov_type *count, gcov_type *all, > > > -- > > > 2.38.1 > > >
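For readers following the thread, here is a rough source-level sketch of what the guard built by gimple_sc amounts to: compare each specialized argument against its recorded constant, AND the results together, call the clone if everything matches, and otherwise fall back to the original call. This is only an illustration under stated assumptions, not the patch's implementation. The names SpecArg, Specialization, myfunc, myfunc_a2, myfunc_a7_b0, guard_matches, guarded_call and make_example_specs are all hypothetical, and the counter exists only to make the taken path observable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical mirror of one (argument index, constant) pair from the patch.
struct SpecArg {
  std::size_t arg_idx;
  std::int64_t value;
};

// Hypothetical description of one specialized clone and its guard.
struct Specialization {
  std::vector<SpecArg> args;  // every pair must match for the guard to hold
  int (*target)(int, int);    // clone to call when the guard holds
};

// Counts calls that took a specialized path (illustration only).
static int specialized_hits = 0;

// Generic function and two hand-written "clones" standing in for the
// constprop clones that IPA CP would create.
static int myfunc(int a, int b) { return a * b + 1; }
static int myfunc_a2(int, int b) { ++specialized_hits; return 2 * b + 1; }
static int myfunc_a7_b0(int, int) { ++specialized_hits; return 1; }

// True when all specialized arguments equal their recorded constants,
// i.e. the conjunction of the per-argument equality tests.
static bool guard_matches(const Specialization &s,
                          const std::vector<int> &argv) {
  for (const SpecArg &p : s.args) {
    assert(p.arg_idx < argv.size());
    if (argv[p.arg_idx] != p.value)
      return false;
  }
  return true;
}

// Chain of guards: the first matching specialization wins, otherwise the
// original (base) call is made.
int guarded_call(const std::vector<Specialization> &specs, int a, int b) {
  const std::vector<int> argv{a, b};
  for (const Specialization &s : specs)
    if (guard_matches(s, argv))
      return s.target(a, b);
  return myfunc(a, b);
}

// Two example specializations: a == 2, and (a == 7 && b == 0).
std::vector<Specialization> make_example_specs() {
  std::vector<Specialization> specs;
  specs.push_back(Specialization{{SpecArg{0, 2}}, myfunc_a2});
  specs.push_back(Specialization{{SpecArg{0, 7}, SpecArg{1, 0}}, myfunc_a7_b0});
  return specs;
}
```

In this sketch every call site pays one comparison per specialized argument for each candidate clone before reaching the fallback, which is the cost that the question about limiting the number of guards and conditions per guard is concerned with.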