Hello, and sorry for a rather late reaction. First and foremost, thanks for the patch, I think it is great to have this finally implemented and although I would like to see a couple things changed, I think the patch is quite close to what could be committed to gcc master.
After my first pass through the patch I have the following comments. I admit I have not looked into the issues with GOMP_task (though it looks like it should be just special-cased in the IPA passes but that can wait for later) and I do not have any strong opinion about what to do with the fact that clang already has the callback attribute (I guess I'd aim for a slightly different name for now). On Sun, Apr 27 2025, Josef Melcr wrote: > This patch enables constant propagation to outlined OpenMP kernels and > improves support for optimizing callback functions in general. It > implements the attribute 'callback' as found in clang, though argument > numbering is a bit different, as described below. The title says OpenMP, > but it can be used for any function which takes a callback argument, such > as pthread functions, qsort and others. > > The attribute 'callback' captures the notion of a function calling one > of its arguments with some of its parameters as arguments. An OpenMP > example of such function is GOMP_parallel. > We implement the attribute with new callgraph edges called 'callback' > edges. They are imaginary edges pointing from the caller of the function > with the attribute (e.g. caller of GOMP_parallel) to the body function > itself (e.g. the outlined OpenMP body). They share their call statement > with the edge from which they are derived (direct edge caller -> GOMP_parallel > in this case). These edges allow passes such as ipa-cp to the see the > hidden call site to the body function and optimize the function accordingly. > > To illustrate on an example, the body GOMP_parallel looks something > like this: > > void GOMP_parallel (void (*fn) (void *), void *data, /* ... */) > { > /* ... */ > fn (data); > /* ... */ > } > > > If we extend it with the attribute 'callback(1, 2)', we express that the > function calls its first argument and passes it its second argument. > This is represented in the call graph in this manner: > > direct indirect > caller -----------------> GOMP_parallel ---------------> fn > | > ----------------------> fn > callback > > The direct edge is then the parent edge, with all callback edges being > the child edges. I know this will sound like nit-picking but could we avoid using the terms parent and children? It is not just that I don't think the names capture what the edges are, but mainly I believe that (after this gets in) people reading the code and comments will not immediately think about callbacks when they encounter mentions of parents and children and will be confused. I'd suggest to call the edges "callback-carrying" and "callback" instead but I am opened to other ideas too. > While constant propagation is the main focus of this patch, callback > edges can be useful for different passes (for example, it improves icf > for OpenMP kernels), as they allow for address redirection. > If the outlined body function gets optimized and cloned, from body_fn to > body_fn.optimized, the callback edge allows us to replace the > address in the arguments list: > > GOMP_parallel (body_fn, &data_struct, /* ... */); > > becomes > > GOMP_parallel (body_fn.optimized, &data_struct, /* ... */); > > This redirection is possible for any function with the attribute. > > This callback attribute implementation is partially compatible with > clang's implementation. Its semantics, arguments and argument indexing style > are > the same, but we represent an unknown argument position with 0 > (precedent set by attributes such as 'format'), while clang uses -1 or '?'. > We also allow for multiple callback attributes on the same function, > while clang only allows one. > > The attribute allows us to propagate constants into body functions of > OpenMP constructs. Currently, GCC won't propagate the value 'c' into the > OpenMP body in the following example: > > int a[100]; > void test(int c) { > #pragma omp parallel for > for (int i = 0; i < c; i++) { > if (!__builtin_constant_p(c)) { > __builtin_abort(); > } > a[i] = i; > } > } > int main() { > test(100); > return a[5] - 5; > } > > With this patch, the body function will get cloned and the constant 'c' > will get propagated. > > Bootstrapped and regtested on x86_64-linux. OK for master? > > Thanks, > Josef Melcr > > gcc/ChangeLog: > > * builtin-attrs.def (0): New int list. > (ATTR_CALLBACK): Callback attribute identifier. > (DEF_CALLBACK_ATTRIBUTE): Macro for callback attribute creation. > (GOMP): Attributes for libgomp functions. > (OACC): Attribute used for oacc functions. > (ATTR_CALLBACK_GOMP_LIST): ATTR_NOTHROW_LIST but with the > callback attribute added, used for many libgomp functions. > (ATTR_CALLBACK_GOMP_TASK_HELPER_LIST): Helper list for the > construction of ATTR_CALLBACK_GOMP_TASK_LIST. > (ATTR_CALLBACK_GOMP_TASK_LIST): New attribute list for > GOMP_task, includes two callback attributes. > (ATTR_CALLBACK_OACC_LIST): Same as ATTR_CALLBACK_GOMP_LIST, used > for oacc builtins. > * cgraph.cc (cgraph_add_edge_to_call_site_hash): When hashing > callback edges, always hash the parent edge. > (cgraph_node::get_edge): Always return callback parent edge. > (cgraph_edge::set_call_stmt): Add cascade for callback edges. > (symbol_table::create_edge): Allow callback edges to share the > same call statement. > (cgraph_edge::make_callback): New method, derives a callback > edge this method is called on. > (cgraph_edge::get_callback_parent_edge): New method. > (cgraph_edge::first_callback_target): New method. > (cgraph_edge::next_callback_target): New method. > (cgraph_edge::purge_callback_children): New method. > (cgraph_edge::redirect_call_stmt_to_callee): Add callback edge > redirection, set call statements for child edges when updating > the parent's statement. > (cgraph_node::remove_callers): Remove child edges when removing > their parent. > (cgraph_edge::dump_edge_flags): Add dumping of callback flags. > (cgraph_edge::maybe_hot_p): Add exception for callback edges. > (cgraph_node::verify_node): Sanity checks for callback edges. > * cgraph.h: Add new cgraph_edge flags and a 16 bit hash for > identifying which attribute originated which edge. > * cgraphclones.cc (cgraph_edge::clone): Copy over callback data. > * doc/extend.texi: Add callback attribute documentation. > * ipa-cp.cc (purge_useless_callback_edges): New function. > (ipcp_decision_stage): Call purge_useless_callback_edges at the > end of the decision stage. > * ipa-fnsummary.cc (ipa_call_summary_t::duplicate): Add an > exception for callback pairs. > (analyze_function_body): Copy summary from parent to child, > update the child's summary. > * ipa-inline-analysis.cc (do_estimate_growth_1): Skip callback > edges when estimating growth. > * ipa-inline-transform.cc (inline_transform): Redirect callback > edges when redirecting their parent. > * ipa-inline.cc (can_inline_edge_p): Never inline callback > edges. > * ipa-param-manipulation.cc (drop_decl_attribute_if_params_changed_p): > New function. > (ipa_param_adjustments::build_new_function_type): Add new out > param, output info about whether args were modified. > (ipa_param_adjustments::adjust_decl): Drop callback attr when > args are modified. > * ipa-param-manipulation.h: Change signature of > build_new_function_type. > * ipa-prop.cc (ipa_duplicate_jump_function): Add declaration. > (init_callback_edge_summary): New function. > (ipa_compute_jump_functions_for_edge): Create callback edges. > * lto-cgraph.cc (lto_output_edge): Stream out callback data. > (input_edge): Input callback data. > * omp-builtins.def (BUILT_IN_GOACC_PARALLEL): Use callback > attribute. > (BUILT_IN_GOMP_PARALLEL_LOOP_STATIC): Likewise. > (BUILT_IN_GOMP_PARALLEL_LOOP_GUIDED): Likewise. > (BUILT_IN_GOMP_PARALLEL_LOOP_NONMONOTONIC_DYNAMIC): Likewise. > (BUILT_IN_GOMP_PARALLEL_LOOP_NONMONOTONIC_RUNTIME): Likewise. > (BUILT_IN_GOMP_PARALLEL): Likewise. > (BUILT_IN_GOMP_TASK): Likewise. > (BUILT_IN_GOMP_PARALLEL_SECTIONS): Likewise. > (BUILT_IN_GOMP_TEAMS_REG): Likewise. > * tree-core.h (ECF_CB_1_0): New constant for attr callback(1,0). > (ECF_CB_1_2): Constant for callback(1,2). > (ECF_CB_2_4): Constant for callback(2,4). > (ECF_CB_3_0_2): Constant for callback(3,0,2). > * tree-inline.cc (copy_bb): Copy callback edges when copying > their parent. > (redirect_all_calls): Redirect callback edges. > * tree.cc (set_call_expr_flags): Create callback attributes > according to the ECF_CB constants. > * attr-callback.h: New file. > > gcc/c-family/ChangeLog: > > * c-attribs.cc: Add callback attribute definition. > > gcc/fortran/ChangeLog: > > * f95-lang.cc (ATTR_CALLBACK_GOMP_LIST): New attr list > corresponding to the definition in builtin-attrs. > (ATTR_CALLBACK_GOMP_TASK_LIST): Likewise. > (ATTR_CALLBACK_OACC_LIST): Likewise. > > gcc/testsuite/ChangeLog: > > * gcc.dg/attr-callback.c: New test. > * gcc.dg/ipa/ipcp-cb1.c: New test. > * gcc.dg/ipa/ipcp-cb2.c: New test. > > Signed-off-by: Josef Melcr <melcr...@fit.cvut.cz> > --- > gcc/attr-callback.h | 322 +++++++++++++++++++++++++++ Another general remark is that I think you have crammed too much stuff into the above header file, including implementation of functions which are not small. In the age of LTO we should not even need to put small functions into header files now :-). I think either the implementations can go to attribs.c or to a new .cc file - but definitely to a .cc file. > gcc/builtin-attrs.def | 21 ++ > gcc/c-family/c-attribs.cc | 3 + > gcc/cgraph.cc | 266 +++++++++++++++++++++- > gcc/cgraph.h | 42 ++++ > gcc/cgraphclones.cc | 3 + > gcc/doc/extend.texi | 37 +++ > gcc/fortran/f95-lang.cc | 4 + > gcc/ipa-cp.cc | 69 +++++- > gcc/ipa-fnsummary.cc | 24 +- > gcc/ipa-inline-analysis.cc | 5 + > gcc/ipa-inline-transform.cc | 12 +- > gcc/ipa-inline.cc | 5 + > gcc/ipa-param-manipulation.cc | 36 ++- > gcc/ipa-param-manipulation.h | 2 +- > gcc/ipa-prop.cc | 86 ++++++- > gcc/lto-cgraph.cc | 6 + > gcc/omp-builtins.def | 28 +-- > gcc/testsuite/gcc.dg/attr-callback.c | 79 +++++++ > gcc/testsuite/gcc.dg/ipa/ipcp-cb1.c | 25 +++ > gcc/testsuite/gcc.dg/ipa/ipcp-cb2.c | 53 +++++ > gcc/tree-core.h | 14 ++ > gcc/tree-inline.cc | 27 ++- > gcc/tree.cc | 42 ++++ > 24 files changed, 1176 insertions(+), 35 deletions(-) > create mode 100644 gcc/attr-callback.h > create mode 100644 gcc/testsuite/gcc.dg/attr-callback.c > create mode 100644 gcc/testsuite/gcc.dg/ipa/ipcp-cb1.c > create mode 100644 gcc/testsuite/gcc.dg/ipa/ipcp-cb2.c > > diff --git a/gcc/attr-callback.h b/gcc/attr-callback.h > new file mode 100644 > index 00000000000..19abbdd09ed > --- /dev/null > +++ b/gcc/attr-callback.h > @@ -0,0 +1,322 @@ > +/* Callback attribute handling > + Copyright (C) 2025 Free Software Foundation, Inc. > + Contributed by Josef Melcr <melcr...@fit.cvut.cz> > + > + This file is part of GCC. > + > + GCC is free software; you can redistribute it and/or modify > + under the terms of the GNU General Public License as published by > + the Free Software Foundation; either version 3 of the License, or > + (at your option) any later version. > + > + GCC is distributed in the hope that it will be useful, > + but WITHOUT ANY WARRANTY; without even the implied warranty of > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > + GNU General Public License for more details. > + > + You should have received a copy of the GNU General Public License > + along with GCC; see the file COPYING3. If not see > + <http://www.gnu.org/licenses/>. */ > + > +#ifndef ATTR_CALLBACK_H > +#define ATTR_CALLBACK_H > +#include "attribs.h" > +#include "cgraph.h" > +#include "system.h" > +#include "tree.h" > +#include "function.h" > +#include "basic-block.h" > +#include "coretypes.h" > +#include "is-a.h" > +#include "predict.h" > +#include "internal-fn.h" > +#include "tree-ssa-alias.h" > +#include "gimple-expr.h" > +#include "gimple.h" > +#include "vec.h" > +#include "inchash.h" > + > +enum callback_position > +{ > + /* Value used when an argument of a callback function > + is unknown or when multiple values may be used. */ > + CB_UNKNOWN_POS = 0 > +}; > + > +/* Given an instance of callback attribute, return the 0-based > + index of the called function in question. */ > +inline int > +callback_get_fn_index (tree cb_attr) > +{ > + tree args = TREE_VALUE (cb_attr); > + int idx = TREE_INT_CST_LOW (TREE_VALUE (args)) - 1; > + return idx; > +} > + > +/* Calculates the incremental hash of the attributes arguments, narrowed > down to > + 16 bits. */ > +inline unsigned int > +callback_hash_attr (tree attr) > +{ > + inchash::hash hasher; > + tree it; > + for (it = TREE_VALUE (attr); it != NULL_TREE; it = TREE_CHAIN (it)) > + { > + unsigned int val = (unsigned int) TREE_INT_CST_LOW (TREE_VALUE (it)); > + hasher.add_int (val); > + } > + unsigned int hash = hasher.end (); > + hash &= 0xffff; > + return hash; > +} I am not convinced the hashing is a good idea, even if we really never have hash collisions (do we?), it looks like too much complication for little gain - which I understand is that we can have as many callback attributer per one callback (function pointer) parameter. At least in the first implementation, I suggest we only allow one callback attribute per parameter and identify each attribute with the index of that parameter. That is, unless you can think of some meaningful close-to-real-life case where optimizing callbacks that are called with different parameters might make sense. Moreover, the limitation can always be lifted later. > + > +/* For a given callback parent-child pair, retrieves the callback attribute > used > + * to create E from the callee of PARENT. */ > +inline tree > +callback_fetch_attr_by_edge (cgraph_edge *e, cgraph_edge *parent) > +{ > + gcc_checking_assert (e->call_stmt == parent->call_stmt > + && e->lto_stmt_uid == parent->lto_stmt_uid); > + tree cb_attr > + = lookup_attribute ("callback", DECL_ATTRIBUTES (parent->callee->decl)); > + gcc_checking_assert (cb_attr); > + tree res = NULL_TREE; > + for (; cb_attr; cb_attr = lookup_attribute ("callback", TREE_CHAIN > (cb_attr))) > + { > + unsigned hash = callback_hash_attr (cb_attr); > + if (hash == e->callback_hash) > + { > + res = cb_attr; > + break; > + } > + } > + gcc_checking_assert (res != NULL_TREE); > + return res; > +} > + > +/* Given an instance of callback attribute, return the 0-base indices > + of arguments passed to the callback. For a callback function taking > + n parameters, returns a vector of n indices of their values in the > parameter > + list of it's caller. Indices with unknown positions will be filled with > + an identity. */ I feel like I must be dumb and don't see something obvious, but how is the identity useful or even the correct thing to do? If the callback function receives some unknown parameter as, say, its third parameter, how is correct to pretend it's getting the third parameter of the callback-calling function? Or do I misunderstand the us of this function? > +inline auto_vec<int> > +callback_get_arg_mapping (cgraph_edge *e, cgraph_edge *parent) > +{ > + tree attr = callback_fetch_attr_by_edge(e, parent); > + gcc_checking_assert (attr); > + tree args = TREE_VALUE (attr); > + auto_vec<int> res; > + tree it; > + > + /* Skip over the first argument, which denotes > + which argument is the called function. */ > + for (it = TREE_CHAIN (args); it != NULL_TREE; it = TREE_CHAIN (it)) > + { > + int idx = TREE_INT_CST_LOW (TREE_VALUE (it)); > + > + /* CB_UNKNOWN_POS signifies an unknown argument, > + replace it with identity for convenience */ > + if (idx == CB_UNKNOWN_POS) > + idx = res.length (); > + /* arguments use 1-based indexing, so we have > + to subtract 1 */ > + else > + idx -= 1; > + > + res.safe_push (idx); > + } > + > + return res; > +} > + > +/* For a callback parent-child pair, returns the 0-based index of the > address of > + E's callee in the argument list of PARENT's callee decl. */ > +inline int > +callback_fetch_fn_position (cgraph_edge *e, cgraph_edge *parent) > +{ > + tree attr = callback_fetch_attr_by_edge (e, parent); > + return callback_get_fn_index (attr); > +} > + > +/* Returns the element at index idx in the list or NULL_TREE if > + the list isn't long enough. NULL_TREE is used as the endpoint. */ > +static tree > +get_nth_list_elem (tree list, unsigned idx) > +{ > + tree res = NULL_TREE; > + unsigned i = 0; > + tree it; > + for (it = list; it != NULL_TREE; it = TREE_CHAIN (it), i++) > + { > + if (i == idx) > + { > + res = TREE_VALUE (it); > + break; > + } > + } > + return res; > +} > + > +/* Handle a "callback" attribute; arguments as in > + struct attribute_spec.handler. */ > +inline tree > +handle_callback_attribute (tree *node, tree name, tree args, > + int ARG_UNUSED (flags), bool *no_add_attrs) > +{ > + tree decl = *node; > + if (TREE_CODE (decl) != FUNCTION_DECL) > + { > + error_at (DECL_SOURCE_LOCATION (decl), > + "%qE attribute can only be used on functions", name); > + *no_add_attrs = true; > + } > + > + tree cb_fn_idx_node = TREE_VALUE (args); > + if (TREE_CODE (cb_fn_idx_node) != INTEGER_CST) > + { > + error_at (DECL_SOURCE_LOCATION (decl), > + "argument specifying callback function position is not an " > + "integer constant"); > + *no_add_attrs = true; > + return NULL_TREE; > + } > + /* We have to use the function type for validation, as > + DECL_ARGUMENTS returns NULL at this point. */ > + unsigned callback_fn_idx = TREE_INT_CST_LOW (cb_fn_idx_node); > + tree decl_type_args = TYPE_ARG_TYPES (TREE_TYPE (decl)); > + tree it; > + unsigned decl_nargs = list_length (decl_type_args); > + for (it = decl_type_args; it != NULL_TREE; it = TREE_CHAIN (it)) > + if (it == void_list_node) > + { > + --decl_nargs; > + break; > + } > + if (callback_fn_idx == CB_UNKNOWN_POS) > + { > + error_at (DECL_SOURCE_LOCATION (decl), > + "callback function position cannot be marked as unknown"); > + *no_add_attrs = true; > + return NULL_TREE; > + } > + --callback_fn_idx; > + if (callback_fn_idx >= decl_nargs) > + { > + error_at (DECL_SOURCE_LOCATION (decl), > + "callback function position out of range"); > + *no_add_attrs = true; > + return NULL_TREE; > + } > + > + /* Search for the type of the callback function > + in parameters of the original function. */ > + tree cfn = get_nth_list_elem(decl_type_args, callback_fn_idx); > + if (cfn == NULL_TREE) > + { > + error_at (DECL_SOURCE_LOCATION (decl), > + "could not retrieve callback function from arguments"); > + *no_add_attrs = true; > + return NULL_TREE; > + } > + tree cfn_pointee_type = TREE_TYPE (cfn); > + if (TREE_CODE (cfn) != POINTER_TYPE > + || TREE_CODE (cfn_pointee_type) != FUNCTION_TYPE) > + { > + error_at (DECL_SOURCE_LOCATION (decl), > + "argument no. %d is not an address of a function", > + callback_fn_idx + 1); > + *no_add_attrs = true; > + return NULL_TREE; > + } > + > + tree type_args = TYPE_ARG_TYPES (cfn_pointee_type); > + /* Compare the length of the list of argument indices > + and the real number of parameters the callback takes. */ > + unsigned cfn_nargs = list_length (TREE_CHAIN (args)); > + unsigned type_nargs = list_length (type_args); > + for (it = type_args; it != NULL_TREE; it = TREE_CHAIN (it)) > + if (it == void_list_node) > + { > + --type_nargs; > + break; > + } > + if (cfn_nargs != type_nargs) > + { > + error_at (DECL_SOURCE_LOCATION (decl), > + "argument number mismatch, %d expected, got %d", type_nargs, > + cfn_nargs); > + *no_add_attrs = true; > + return NULL_TREE; > + } > + > + unsigned curr = 0; > + tree cfn_it; > + /* Validate type compatibility of the arguments passed > + from caller function to callback. "it" is used to step > + through the parameters of the caller, "cfn_it" is > + stepping through the parameters of the callback. */ > + for (it = type_args, cfn_it = TREE_CHAIN (args); curr < type_nargs; > + it = TREE_CHAIN (it), cfn_it = TREE_CHAIN (cfn_it), curr++) > + { > + if (TREE_CODE (TREE_VALUE (cfn_it)) != INTEGER_CST) > + { > + error_at (DECL_SOURCE_LOCATION (decl), > + "argument no. %d is not an integer constant", curr + 1); > + *no_add_attrs = true; > + continue; > + } > + > + unsigned arg_idx = TREE_INT_CST_LOW (TREE_VALUE (cfn_it)); > + > + /* No need to check for type compatibility, > + if we don't know what we are passing. */ > + if (arg_idx == CB_UNKNOWN_POS) > + { > + continue; > + } > + > + arg_idx -= 1; > + /* Report an error if the position is out of bounds, > + but we can still check the rest of the arguments. */ > + if (arg_idx >= decl_nargs) > + { > + error_at (DECL_SOURCE_LOCATION (decl), > + "callback argument index %d is out of range", arg_idx + 1); > + *no_add_attrs = true; > + continue; > + } > + > + tree arg_type = get_nth_list_elem (decl_type_args, arg_idx); > + tree expected_type = TREE_VALUE (it); > + /* Check the type of the value we are about to pass ("arg_type") > + for compatibility with the actual type the callback function > + expects ("expected_type"). */ > + if (!types_compatible_p (expected_type, arg_type)) > + { > + error_at (DECL_SOURCE_LOCATION (decl), > + "argument type at index %d is not compatible with callback " > + "argument type at index %d", > + arg_idx + 1, curr + 1); > + *no_add_attrs = true; > + continue; > + } > + } > + > + return NULL_TREE; > +} > + > +/* Returns TRUE if E is considered useful in the callgraph, FALSE otherwise. > If > + * this predicate returns FALSE, then E wasn't used to optimize its callee > and > + * can be safely removed from the callgraph. */ > +inline bool > +callback_edge_useful_p (cgraph_edge *e) > +{ > + gcc_checking_assert (e->callback); > + /* If the edge is not pointing towards a clone, it is no longer useful as > its > + entire purpose is to produce clones of callbacks. */ > + if (!e->callee->clone_of) > + return false; > + return true; > +} > + > +#endif /* ATTR_CALLBACK_H */ > diff --git a/gcc/builtin-attrs.def b/gcc/builtin-attrs.def > index 850efea11ca..f6043747773 100644 > --- a/gcc/builtin-attrs.def > +++ b/gcc/builtin-attrs.def > @@ -75,6 +75,7 @@ DEF_ATTR_FOR_STRING (STRERRNOP, ".P") > #define DEF_LIST_INT_INT(VALUE1, VALUE2) \ > DEF_ATTR_TREE_LIST (ATTR_LIST_##VALUE1##_##VALUE2, ATTR_NULL, > \ > ATTR_##VALUE1, ATTR_LIST_##VALUE2) > +DEF_LIST_INT_INT (0,2) > DEF_LIST_INT_INT (1,0) > DEF_LIST_INT_INT (1,2) > DEF_LIST_INT_INT (1,3) > @@ -122,6 +123,7 @@ DEF_ATTR_IDENT (ATTR_TM_TMPURE, "transaction_pure") > DEF_ATTR_IDENT (ATTR_RETURNS_TWICE, "returns_twice") > DEF_ATTR_IDENT (ATTR_RETURNS_NONNULL, "returns_nonnull") > DEF_ATTR_IDENT (ATTR_WARN_UNUSED_RESULT, "warn_unused_result") > +DEF_ATTR_IDENT (ATTR_CALLBACK, "callback") > > DEF_ATTR_TREE_LIST (ATTR_NOVOPS_LIST, ATTR_NOVOPS, ATTR_NULL, ATTR_NULL) > > @@ -416,6 +418,25 @@ DEF_FORMAT_ATTRIBUTE_NOTHROW(STRFMON,3,3_4) > #undef DEF_FORMAT_ATTRIBUTE_NOTHROW > #undef DEF_FORMAT_ATTRIBUTE_BOTH > > +/* Callback attr */ > +#define DEF_CALLBACK_ATTRIBUTE(TYPE, CA, VALUES) \ > + DEF_ATTR_TREE_LIST (ATTR_CALLBACK_##TYPE##_##CA##_##VALUES, ATTR_CALLBACK,\ > + ATTR_##CA, ATTR_LIST_##VALUES) > + > +DEF_CALLBACK_ATTRIBUTE(GOMP, 1, 0) > +DEF_CALLBACK_ATTRIBUTE(GOMP, 1, 2) > +DEF_CALLBACK_ATTRIBUTE(OACC, 2, 4) > +DEF_CALLBACK_ATTRIBUTE(GOMP, 3, 0_2) > +DEF_ATTR_TREE_LIST(ATTR_CALLBACK_GOMP_LIST, ATTR_CALLBACK, > + > ATTR_CALLBACK_GOMP_1_2, ATTR_NOTHROW_LIST) > +DEF_ATTR_TREE_LIST(ATTR_CALLBACK_GOMP_TASK_HELPER_LIST, ATTR_CALLBACK, > + > ATTR_CALLBACK_GOMP_1_0, ATTR_NOTHROW_LIST) > +DEF_ATTR_TREE_LIST(ATTR_CALLBACK_GOMP_TASK_LIST, ATTR_CALLBACK, > + > ATTR_CALLBACK_GOMP_3_0_2, ATTR_CALLBACK_GOMP_TASK_HELPER_LIST) > +DEF_ATTR_TREE_LIST(ATTR_CALLBACK_OACC_LIST, ATTR_CALLBACK, > + > ATTR_CALLBACK_OACC_2_4, ATTR_NOTHROW_LIST) > +#undef DEF_CALLBACK_ATTRIBUTE > + > /* Transactional memory variants of the above. */ > > DEF_ATTR_TREE_LIST (ATTR_TM_NOTHROW_LIST, > diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc > index 5a0e3d328ba..d88faf69544 100644 > --- a/gcc/c-family/c-attribs.cc > +++ b/gcc/c-family/c-attribs.cc > @@ -49,6 +49,7 @@ along with GCC; see the file COPYING3. If not see > #include "tree-pretty-print.h" > #include "gcc-rich-location.h" > #include "gcc-urlifier.h" > +#include "attr-callback.h" > > static tree handle_packed_attribute (tree *, tree, tree, int, bool *); > static tree handle_nocommon_attribute (tree *, tree, tree, int, bool *); > @@ -465,6 +466,8 @@ const struct attribute_spec c_common_gnu_attributes[] = > handle_tm_attribute, NULL }, > { "transaction_may_cancel_outer", 0, 0, false, true, false, false, > handle_tm_attribute, NULL }, > + { "callback", 1, -1, true, false, false, false, > + handle_callback_attribute, NULL}, > /* ??? These two attributes didn't make the transition from the > Intel language document to the multi-vendor language document. */ > { "transaction_pure", 0, 0, false, true, false, false, > diff --git a/gcc/cgraph.cc b/gcc/cgraph.cc > index 6ae6a97f6f5..ee8ebe04e73 100644 > --- a/gcc/cgraph.cc > +++ b/gcc/cgraph.cc > @@ -69,6 +69,7 @@ along with GCC; see the file COPYING3. If not see > #include "tree-nested.h" > #include "symtab-thunks.h" > #include "symtab-clones.h" > +#include "attr-callback.h" > > /* FIXME: Only for PROP_loops, but cgraph shouldn't have to know about this. > */ > #include "tree-pass.h" > @@ -720,11 +721,21 @@ cgraph_add_edge_to_call_site_hash (cgraph_edge *e) > one indirect); always hash the direct one. */ > if (e->speculative && e->indirect_unknown_callee) > return; > + /* We always want to hash the parent edge of a callback, not the edges > + pointing to the callbacks themselves, as their call statement doesn't > + exist. */ > + if (e->callback) > + return; > cgraph_edge **slot = e->caller->call_site_hash->find_slot_with_hash > (e->call_stmt, cgraph_edge_hasher::hash (e->call_stmt), INSERT); > if (*slot) > { > - gcc_assert (((cgraph_edge *)*slot)->speculative); > + cgraph_edge *edge = (cgraph_edge *) *slot; > + gcc_assert (edge->speculative || edge->has_callback); > + if (edge->has_callback) > + /* If the slot is already occupied, then the hashed edge is the parent, > + which is desired behavior, so we can safely return. */ I wonder if this can actually happen when we early return for callback functions above before we try to add them. (I sort of wonder about the speculative case too but I guess that edges can stop being indirect which then would make a difference.) But perhaps this is OK and I'm missing some case where this can indeed happen. > + return; > if (e->callee && (!e->prev_callee > || !e->prev_callee->speculative > || e->prev_callee->call_stmt != e->call_stmt)) > @@ -768,6 +779,13 @@ cgraph_node::get_edge (gimple *call_stmt) > n++; > } > > + /* We want to work with the parent edge whenever possible. When it comes to > + callback edges, a call statement might have multiple callback edges > + attached to it. These can be easily obtained from the parent edge > instead. > + */ > + if (e && e->callback) > + e = e->get_callback_parent_edge (); > + > if (n > 100) > { > call_site_hash = hash_table<cgraph_edge_hasher>::create_ggc (120); > @@ -837,8 +855,31 @@ cgraph_edge::set_call_stmt (cgraph_edge *e, gcall > *new_stmt, > return e_indirect ? indirect : direct; > } > > - if (new_direct_callee) > - e = make_direct (e, new_direct_callee); Was this removal intentional? > + /* Callback edges also need their call stmts changed. > + We can use the same flag as for speculative edges. */ If by same flag you mean the update_speculative parameter than that parameter IMHO should be renamed. > + if (update_speculative && (e->callback || e->has_callback)) > + { > + cgraph_edge *current, *next; > + > + current = e->first_callback_target (); > + if (current) > + { > + gcall *old_stmt = current->call_stmt; > + for (cgraph_edge *d = current; d; d = next) > + { > + next = d->next_callee; > + for (; next; next = next->next_callee) > + { > + /* has_callback doesn't need to checked, as their > + call statements wouldn't match */ I'm afraid I don't understand this comment. > + if (next->callback && old_stmt == next->call_stmt) > + break; > + } > + cgraph_edge *d2 = set_call_stmt (d, new_stmt, false); > + gcc_assert (d2 == d); > + } > + } > + } > > /* Only direct speculative edges go to call_site_hash. */ > if (e->caller->call_site_hash > @@ -885,7 +926,7 @@ symbol_table::create_edge (cgraph_node *caller, > cgraph_node *callee, > construction of call stmt hashtable. */ > cgraph_edge *e; > gcc_checking_assert (!(e = caller->get_edge (call_stmt)) > - || e->speculative); > + || e->speculative || e->has_callback || e->callback); > > gcc_assert (is_gimple_call (call_stmt)); > } > @@ -912,6 +953,9 @@ symbol_table::create_edge (cgraph_node *caller, > cgraph_node *callee, > edge->indirect_info = NULL; > edge->indirect_inlining_edge = 0; > edge->speculative = false; > + edge->has_callback = false; > + edge->callback = false; > + edge->callback_hash = 0; > edge->indirect_unknown_callee = indir_unknown_callee; > if (call_stmt && caller->call_site_hash) > cgraph_add_edge_to_call_site_hash (edge); > @@ -1135,6 +1179,117 @@ cgraph_edge::make_speculative (cgraph_node *n2, > profile_count direct_count, > return e2; > } > > +/* Turn edge into a callback edge calling N2. Callback edges > + never get turned into actual calls, they are just used > + as clues and allow for optimizing functions which do not > + have any callsites during compile time, e.g. functions > + passed to standard library functions. I guess that the comment of this (and previous, pre-existing) function should say something like "create a speculative/callback edge", rather than "turn this edge into..." > + > + The edge will be attached to the same call statement as > + it's parent, which is the instance this method is called on. > + > + callback_hash is used to pair the returned edge with the attribute that > + originated it. > + > + Return the resulting callback edge. */ > + > +cgraph_edge * > +cgraph_edge::make_callback (cgraph_node *n2, unsigned int callback_hash) > +{ > + cgraph_node *n = caller; > + cgraph_edge *e2; > + > + has_callback = true; > + e2 = n->create_edge (n2, call_stmt, count); > + if (dump_file) > + fprintf (dump_file, > + "Created callback edge %s -> %s belonging to parent %s -> %s\n", > + e2->caller->dump_name (), e2->callee->dump_name (), > + caller->name (), callee->name ()); Please use dump_name in dumps (unless there is some specific reason not to). > + initialize_inline_failed (e2); > + e2->callback = true; > + e2->callback_hash = callback_hash; > + if (TREE_NOTHROW (n2->decl)) > + e2->can_throw_external = false; > + else > + e2->can_throw_external = can_throw_external; > + e2->lto_stmt_uid = lto_stmt_uid; > + n2->mark_address_taken (); > + return e2; > +} > + > +/* Returns the parent edge of a callback edge on which > + it is called on or NULL when no such edge can be found. > + > + An edge is taken to be a parent if it has it's has_callback > + flag set and the edges share their call statements. */ > + > +cgraph_edge * > +cgraph_edge::get_callback_parent_edge () > +{ > + gcc_checking_assert (callback); > + cgraph_edge *e; > + for (e = caller->callees; e; e = e->next_callee) > + { > + if (e->has_callback && e->call_stmt == call_stmt > + && e->lto_stmt_uid == lto_stmt_uid) > + break; > + } > + return e; > +} > + > +/* Returns the first callback edge in the list of callees of the caller node. > + Note that the edges might be in arbitrary order. Must be called on a > + callback or parent edge. */ > +cgraph_edge * > +cgraph_edge::first_callback_target () Why is this called a "target?" Perhaps something like "first_callback_edge" would be better? Sorry about the dwelling on naming so much, but I think this one makes the code more difficult to read (or I am missing something again). > +{ > + gcc_checking_assert (has_callback || callback); > + cgraph_edge *e = NULL; > + for (e = caller->callees; e; e = e->next_callee) > + { > + if (e->callback && e->call_stmt == call_stmt > + && e->lto_stmt_uid == lto_stmt_uid) > + { > + break; > + } > + } > + return e; > +} > + > +/* Given a callback edge, returns the next callback edge belonging to the > same > + parent. Must be called on a callback edge, not the parent.*/ > +cgraph_edge * > +cgraph_edge::next_callback_target () > +{ > + gcc_checking_assert (callback); > + cgraph_edge *e = NULL; > + for (e = next_callee; e; e = e->next_callee) > + { > + if (e->callback && e->call_stmt == call_stmt > + && e->lto_stmt_uid == lto_stmt_uid) > + { > + break; > + } > + } > + return e; > +} > + > +/* When called on a callback parent edge, removes all of its child edges and > + sets has_callback to FALSE. */ > +void > +cgraph_edge::purge_callback_children () > +{ > + gcc_checking_assert (has_callback); > + cgraph_edge *e, *next; > + for (e = first_callback_target (); e; e = next) > + { > + next = e->next_callback_target (); > + cgraph_edge::remove (e); > + } > + has_callback = false; > +} I know I wrote I wanted the hash removed but is it not necessary to clear it here, given that verify_node checks it is zero when not needed? > + > /* Speculative call consists of an indirect edge and one or more > direct edge+ref pairs. > > @@ -1494,6 +1649,24 @@ cgraph_edge::redirect_call_stmt_to_callee (cgraph_edge > *e, > || decl == e->callee->decl) > return e->call_stmt; > > + /* When redirecting a callback edge, all we need to do is replace > + the original address with the address of the function we are > + redirecting to. */ > + if (e->callback) > + { > + cgraph_edge *parent = e->get_callback_parent_edge (); > + if (!lookup_attribute ("callback", > + DECL_ATTRIBUTES (parent->callee->decl))) > + /* Callback attribute is removed if the offloading function changes > + signature, as the indices would be correct anymore. These edges will > + get cleaned up later, ignore their redirection for now. */ > + return e->call_stmt; > + int fn_idx > + = callback_fetch_fn_position (e, parent); > + gimple_call_set_arg (e->call_stmt, fn_idx, build_addr > (e->callee->decl)); I think you want build_fold_addr_expr_loc instead of build_addr here. > + return e->call_stmt; > + } > + > if (decl && ipa_saved_clone_sources) > { > tree *p = ipa_saved_clone_sources->get (e->callee); > @@ -1603,7 +1776,9 @@ cgraph_edge::redirect_call_stmt_to_callee (cgraph_edge > *e, > maybe_remove_unused_call_args (DECL_STRUCT_FUNCTION (e->caller->decl), > new_stmt); > > - e->caller->set_call_stmt_including_clones (e->call_stmt, new_stmt, false); > + /* Update callback child edges if setting the parent's statement, or else > + their their pairing would fall apart. */ > + e->caller->set_call_stmt_including_clones (e->call_stmt, new_stmt, > e->has_callback); > > if (symtab->dump_file) > { > @@ -1782,6 +1957,18 @@ cgraph_node::remove_callers (void) > for (e = callers; e; e = f) > { > f = e->next_caller; > + /* When removing a parent edge, remove all its child edges as well. */ > + if (e->has_callback) > + { > + cgraph_edge *cbe, *next_cbe = NULL; > + for (cbe = e->first_callback_target (); cbe; cbe = next_cbe) > + { > + next_cbe = cbe->next_callback_target (); > + symtab->call_edge_removal_hooks (cbe); > + cbe->remove_caller (); > + symtab->free_edge (cbe); > + } > + } > symtab->call_edge_removal_hooks (e); > e->remove_caller (); > symtab->free_edge (e); > @@ -2091,6 +2278,10 @@ cgraph_edge::dump_edge_flags (FILE *f) > { > if (speculative) > fprintf (f, "(speculative) "); > + if (callback) > + fprintf (f, "(callback) "); > + if (has_callback) > + fprintf (f, "(has_callback) "); > if (!inline_failed) > fprintf (f, "(inlined) "); > if (call_stmt_cannot_inline_p) > @@ -2989,6 +3180,10 @@ cgraph_edge::cannot_lead_to_return_p (void) > bool > cgraph_edge::maybe_hot_p (void) > { > + /* TODO: Always consider callback hot, otherwise they would never get > cloned. > + This can be changed after ipa-cp heuristics get fixed. */ > + if (callback) > + return true; > if (!maybe_hot_count_p (NULL, count.ipa ())) > return false; > if (caller->frequency == NODE_FREQUENCY_UNLIKELY_EXECUTED > @@ -3656,6 +3851,8 @@ cgraph_node::verify_node (void) > if (gimple_has_body_p (e->caller->decl) > && !e->caller->inlined_to > && !e->speculative > + && !e->callback > + && !e->has_callback > /* Optimized out calls are redirected to __builtin_unreachable. */ > && (e->count.nonzero_p () > || ! e->callee->decl > @@ -3861,7 +4058,12 @@ cgraph_node::verify_node (void) > } > if (!e->indirect_unknown_callee) > { > - if (e->verify_corresponds_to_fndecl (decl)) > + /* Callback edges violate this assertion > + because their call statement doesn't exist, > + their associated statement belongs to the > + offloading function. */ > + if (!e->callback > + && e->verify_corresponds_to_fndecl (decl)) > { > error ("edge points to wrong declaration:"); > debug_tree (e->callee->decl); > @@ -3903,7 +4105,57 @@ cgraph_node::verify_node (void) > > for (e = callees; e; e = e->next_callee) > { > - if (!e->aux && !e->speculative) > + if (!e->callback && e->callback_hash) > + { > + error ("non-callback edge has callback_hash set"); > + error_found = true; > + } > + > + if (e->callback && e->has_callback) > + { > + error ("edge has both callback and has_callback set"); > + error_found = true; > + } > + > + if (e->callback) > + { > + if (!e->get_callback_parent_edge ()) > + { > + error ("callback edge %s->%s has no parent", > + identifier_to_locale (e->caller->name ()), > + identifier_to_locale (e->callee->name ())); > + error_found = true; > + } > + } > + > + if (e->has_callback) > + { > + int ncallbacks = 0; > + int nfound_edges = 0; > + for (tree cb = lookup_attribute ("callback", DECL_ATTRIBUTES ( > + e->callee->decl)); > + cb; cb = lookup_attribute ("callback", TREE_CHAIN (cb)), > + ncallbacks++) > + ; > + for (cgraph_edge *cbe = callees; cbe; cbe = cbe->next_callee) > + { > + if (cbe->callback && cbe->call_stmt == e->call_stmt > + && cbe->lto_stmt_uid == e->lto_stmt_uid) > + { > + nfound_edges++; > + } > + } > + if (ncallbacks < nfound_edges) > + { > + error ("callback edge %s->%s child edge count mismatch, " > + "expected at most %d, found %d", > + identifier_to_locale (e->caller->name ()), > + identifier_to_locale (e->callee->name ()), ncallbacks, > + nfound_edges); > + } > + } > + > + if (!e->aux && !e->speculative && !e->callback && !e->has_callback) > { > error ("edge %s->%s has no corresponding call_stmt", > identifier_to_locale (e->caller->name ()), > diff --git a/gcc/cgraph.h b/gcc/cgraph.h > index abde770ba2b..cc12ed0c97c 100644 > --- a/gcc/cgraph.h > +++ b/gcc/cgraph.h > @@ -1736,6 +1736,31 @@ public: > cgraph_edge *make_speculative (cgraph_node *n2, profile_count direct_count, > unsigned int speculative_id = 0); > > + /* Turns edge into a callback edge, representing an indirect call to n2 > + passed to a function by argument. Sets has_callback flag of the original > + edge. Both edges are attached to the same call statement. Returns > created > + callback edge. */ > + cgraph_edge *make_callback (cgraph_node *n2, unsigned int callback_hash); > + > + /* Returns the parent edge of a callback edge or NULL, if such edge > + cannot be found. An edge is considered a parent, if it has it's > + has_callback flag set and shares it's call statement with the edge > + this method is caled on. */ > + cgraph_edge *get_callback_parent_edge (); > + > + /* Returns the first callback edge in the list of callees of the caller > node. > + Note that the edges might be in arbitrary order. Must be called on a > + callback or parent edge. */ > + cgraph_edge *first_callback_target (); > + > + /* Given a callback edge, returns the next callback edge belonging to the > same > + parent. Must be called on a callback edge, not the parent.*/ > + cgraph_edge *next_callback_target (); > + > + /* When called on a callback parent edge, removes all of its child edges > and > + sets has_callback to FALSE. */ > + void purge_callback_children (); > + > /* Speculative call consists of an indirect edge and one or more > direct edge+ref pairs. Speculative will expand to the following > sequence: > > @@ -1952,6 +1977,23 @@ public: > Optimizers may later redirect direct call to clone, so 1) and 3) > do not need to necessarily agree with destination. */ > unsigned int speculative : 1; > + /* Edges with CALLBACK flag represent indirect calls to functions passed > + to their callers by argument. This is useful in cases, where the body > + of these caller functions is not known, e. g. qsort in glibc or > + GOMP_parallel in libgomp. These edges are never made into real calls, > + but are used instead to optimize these callback functions and later > replace > + their addresses with their optimized versions. Edges with this flag set > + share their call statement with their parent edge. */ > + unsigned int callback : 1; > + /* Edges with this flag set have one or more child callabck edges. They > share > + their call statements with this edge. This flag represents the fact that > + the callee of this edge takes a function and it's parameters by argument > + and calls it at a later time. */ > + unsigned int has_callback : 1; > + /* Hash calculated from arguments of a callback attribute. Used to pair > + callback edges and the attributes that originated them together. Needed > + in order to get ipa-icf to work with callbacks. */ > + unsigned int callback_hash : 16; > /* Set to true when caller is a constructor or destructor of polymorphic > type. */ > unsigned in_polymorphic_cdtor : 1; > diff --git a/gcc/cgraphclones.cc b/gcc/cgraphclones.cc > index e6223fa1f5c..8063ba77536 100644 > --- a/gcc/cgraphclones.cc > +++ b/gcc/cgraphclones.cc > @@ -144,6 +144,9 @@ cgraph_edge::clone (cgraph_node *n, gcall *call_stmt, > unsigned stmt_uid, > new_edge->can_throw_external = can_throw_external; > new_edge->call_stmt_cannot_inline_p = call_stmt_cannot_inline_p; > new_edge->speculative = speculative; > + new_edge->callback = callback; > + new_edge->has_callback = has_callback; > + new_edge->callback_hash = callback_hash; > new_edge->in_polymorphic_cdtor = in_polymorphic_cdtor; > > /* Update IPA profile. Local profiles need no updating in original. */ > diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi > index 0978c4c41b2..f23fe11d9fd 100644 > --- a/gcc/doc/extend.texi > +++ b/gcc/doc/extend.texi > @@ -1970,6 +1970,43 @@ declares that @code{my_alloc1} returns 16-byte aligned > pointers and > that @code{my_alloc2} returns a pointer whose value modulo 32 is equal > to 8. > > +@cindex @code{callback} function attribute > +@item callback (@var{function-pos}, @var{...}) > +The @code{callback} attribute specifies that a function takes a pointer to > +a callback function as one of it's parameters and passes it some of it's own > +parameters. For example: > + > +@smallexample > +void foo(void (*fn)(int*), int *data) __attribute__((callback(1, 2))); > +@end smallexample > + > +where body of @code{foo} looks something like: > + > +@smallexample > +void foo(void (*fn)(int*), int *data) > +@{ > + ... > + fn(data); > + ... > +@} > +@end smallexample > + > +This is particuarly useful for cases, where body of functions with callbacks > +is unknown at compile-time. Using this attribute allows GCC to perform > +optimizations on the callback function, namely constant propagation. > +The parameter @var{function-pos} specifies the position of the pointer > +to the callback function. All indices start from 1. This parameter should be > +followed by @var{n} positions of arguments passed to the callback function > +(where @var{n} is the number of arguments the callback function takes) in > order > +in which they are passed. Value 0 should be used in places where the position > +for a given argument is unknown or the value is not passed through the > caller. > +When used with non-static C++ methods, all indices should start at 2, since > the > +first argument is implicit @code{this}. > + > +In the example above, function @code{foo} takes it's callback function as > it's > +first argument and passes it it's second argument, so the correct values of > +parameters are 1 and 2. > + > @cindex @code{cold} function attribute > @item cold > The @code{cold} attribute on functions is used to inform the compiler that > diff --git a/gcc/fortran/f95-lang.cc b/gcc/fortran/f95-lang.cc > index 1f09553142d..009701faa3f 100644 > --- a/gcc/fortran/f95-lang.cc > +++ b/gcc/fortran/f95-lang.cc > @@ -580,6 +580,10 @@ gfc_builtin_function (tree decl) > #define ATTR_COLD_NORETURN_NOTHROW_LEAF_LIST \ > (ECF_COLD | ECF_NORETURN | \ > ECF_NOTHROW | ECF_LEAF) > +#define ATTR_CALLBACK_GOMP_LIST (ECF_CB_1_2 | ATTR_NOTHROW_LIST) > +#define ATTR_CALLBACK_GOMP_TASK_LIST \ > + (ECF_CB_3_0_2 | ECF_CB_1_0 | ATTR_NOTHROW_LIST) > +#define ATTR_CALLBACK_OACC_LIST (ECF_CB_2_4 | ATTR_NOTHROW_LIST) > > static void > gfc_define_builtin (const char *name, tree type, enum built_in_function code, > diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc > index b4b96997d75..c706a3195b6 100644 > --- a/gcc/ipa-cp.cc > +++ b/gcc/ipa-cp.cc > @@ -131,7 +131,7 @@ along with GCC; see the file COPYING3. If not see > #include "dbgcnt.h" > #include "symtab-clones.h" > #include "gimple-range.h" > - > +#include "attr-callback.h" > > /* Allocation pools for values and their sources in ipa-cp. */ > > @@ -6241,6 +6241,68 @@ identify_dead_nodes (struct cgraph_node *node) > } > } > > +/* Removes all useless callback edges from the callgraph. Useless callback > edges > + might mess up the callgraph, because they might be impossible to redirect > and > + so on, leading to crashes. Their usefulness is evaluated through > + callback_edge_useful_p*/ > +static void > +purge_useless_callback_edges () > +{ > + if (dump_file) > + fprintf (dump_file, "\nPurging useless callback edges:\n"); > + > + cgraph_edge *e; > + cgraph_node *node; > + FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (node) > + { > + for (e = node->callees; e; e = e->next_callee) > + { > + if (e->has_callback) > + { > + if (dump_file) > + fprintf (dump_file, "\tExamining children of edge %s -> %s:\n", > + e->caller->name (), e->callee->name ()); > + if (!lookup_attribute ("callback", > + DECL_ATTRIBUTES (e->callee->decl))) > + { > + if (dump_file) > + fprintf ( > + dump_file, > + "\t\tPurging children, because the offloading " > + "function no longer has any callback attributes.\n"); > + e->purge_callback_children (); > + continue; > + } > + cgraph_edge *cbe, *next; > + for (cbe = e->first_callback_target (); cbe; cbe = next) > + { > + next = cbe->next_callback_target (); > + if (!callback_edge_useful_p (cbe)) > + { > + if (dump_file) > + fprintf (dump_file, > + "\t\tCallback edge %s -> %s not deemed " > + "useful, removing.\n", > + cbe->caller->name (), cbe->callee->name ()); > + cgraph_edge::remove (cbe); > + } > + else > + { > + if (dump_file) > + fprintf (dump_file, > + "\t\tNot considering callback edge %s -> %s " > + "for deletion.\n", > + cbe->caller->name (), cbe->callee->name ()); Please reword to something like "kept because it looks useful" rather than "not deleting." Thanks. > + } > + } > + } > + } > + } > + > + if (dump_file) > + fprintf (dump_file, "\n"); > +} > + > /* The decision stage. Iterate over the topological order of call graph > nodes > TOPO and make specialized clones if deemed beneficial. */ > > @@ -6271,6 +6333,11 @@ ipcp_decision_stage (class ipa_topo_info *topo) > if (change) > identify_dead_nodes (node); > } > + > + /* Currently, the primary use of callback edges is constant propagation. > + Constant propagation is now over, so we have to remove unused callback > + edges. */ > + purge_useless_callback_edges (); > } > > /* Look up all VR and bits information that we have discovered and copy it > diff --git a/gcc/ipa-fnsummary.cc b/gcc/ipa-fnsummary.cc > index 4c062fe8a0e..fb854fa65db 100644 > --- a/gcc/ipa-fnsummary.cc > +++ b/gcc/ipa-fnsummary.cc > @@ -990,7 +990,10 @@ ipa_call_summary_t::duplicate (struct cgraph_edge *src, > info->predicate = NULL; > edge_set_predicate (dst, srcinfo->predicate); > info->param = srcinfo->param.copy (); > - if (!dst->indirect_unknown_callee && src->indirect_unknown_callee) > + if (!dst->indirect_unknown_callee && src->indirect_unknown_callee > + /* Don't subtract the size when dealing with callback pairs, since the > + edge has no real size. */ > + && !src->has_callback && !dst->callback) > { > info->call_stmt_size -= (eni_size_weights.indirect_call_cost > - eni_size_weights.call_cost); > @@ -3106,6 +3109,25 @@ analyze_function_body (struct cgraph_node *node, bool > early) > es, es3); > } > } > + > + /* If dealing with a parent edge, copy its summary over to its > + children as well. */ > + if (edge->has_callback) > + { > + cgraph_edge *child; > + for (child = edge->first_callback_target (); child; > + child = child->next_callback_target ()) > + { > + ipa_call_summary *es2 = ipa_call_summaries->get (child); > + es2 = ipa_call_summaries->get_create (child); > + ipa_call_summaries->duplicate (edge, child, es, es2); > + /* Unlike speculative edges, callback edges have no real > + size or time; the call doesn't exist. Reflect that in > + their summaries. */ > + es2->call_stmt_size = 0; > + es2->call_stmt_time = 0; > + } > + } > } > > /* TODO: When conditional jump or switch is known to be constant, but > diff --git a/gcc/ipa-inline-analysis.cc b/gcc/ipa-inline-analysis.cc > index c5472cb0ff0..b24116a0ca9 100644 > --- a/gcc/ipa-inline-analysis.cc > +++ b/gcc/ipa-inline-analysis.cc > @@ -417,6 +417,11 @@ do_estimate_growth_1 (struct cgraph_node *node, void > *data) > { > gcc_checking_assert (e->inline_failed); > > + /* Don't count callback edges into growth, since they are never inlined > + anyway. */ > + if (e->callback) > + continue; > + > if (cgraph_inline_failed_type (e->inline_failed) == CIF_FINAL_ERROR > || !opt_for_fn (e->caller->decl, optimize)) > { > diff --git a/gcc/ipa-inline-transform.cc b/gcc/ipa-inline-transform.cc > index d2c9a2da6de..11182b673a9 100644 > --- a/gcc/ipa-inline-transform.cc > +++ b/gcc/ipa-inline-transform.cc > @@ -798,7 +798,17 @@ inline_transform (struct cgraph_node *node) > if (!e->inline_failed) > has_inline = true; > next = e->next_callee; > - cgraph_edge::redirect_call_stmt_to_callee (e); > + if (e->has_callback) > + { > + /* Redirect child edges when redirecting their parent. */ > + cgraph_edge *cbe; > + cgraph_edge::redirect_call_stmt_to_callee (e); > + for (cbe = e->first_callback_target (); cbe; > + cbe = cbe->next_callback_target ()) > + cgraph_edge::redirect_call_stmt_to_callee (cbe); > + } > + else > + cgraph_edge::redirect_call_stmt_to_callee (e); > } > node->remove_all_references (); > > diff --git a/gcc/ipa-inline.cc b/gcc/ipa-inline.cc > index d9fc111a9e7..78dbf3c4f65 100644 > --- a/gcc/ipa-inline.cc > +++ b/gcc/ipa-inline.cc > @@ -373,6 +373,11 @@ can_inline_edge_p (struct cgraph_edge *e, bool report, > { > gcc_checking_assert (e->inline_failed); > > + /* Never inline callback edges, since the call doesn't exist in > + reality. */ > + if (e->callback) > + return false; Honza may want to comment here whether we want to invent a new "inline failed" code in cif-code.def instead in inline_failed instead. At least to me it would look more consistent (but I don't care too much). Thanks again for working on this. I hope I'll be able to review any future iterations of this earlier and definitely hope we will get the functionality to the upcoming major GCC version. Martin