A gentle ping

> On 30 Oct 2025, at 10:42 AM, Prachi Godbole <[email protected]> wrote:
>
> This patch attempts to reduce compile time for the locality cloning pass by
> reducing recursive calls to partition_callchain ().  This is achieved by
> precomputing caller and callee information into locality_info.  locality_info
> stores all callees of a node, either directly or via inlined nodes, thereby
> avoiding calls to partition_callchain () for inlined nodes, which are already
> partitioned with their inlined_to nodes.  locality_info also stores
> precomputed accumulated incoming edge frequencies per unique caller, which
> avoids repeated computation within partition_callchain (), as well as
> preaccumulated and sorted outgoing edge frequencies for unique callees.
>
> This patch refines the is_entry_node_p () check by calling local_p () instead
> of just an alias check.
>
> Approximately 45% compile time improvement is observed for the
> bootstrap-lto-locality config, which takes 2-5% more time on top of
> bootstrap-lto.
>
> This patch also handles appropriate memory management of pass-specific data
> structures.
>
> Bootstrapped and tested on aarch64-none-linux-gnu.
> Ok for mainline?
>
> Thanks,
> Prachi
>
> Signed-off-by: Prachi Godbole <[email protected]>
>
> gcc/ChangeLog:
>
> 	* ipa-locality-cloning.cc (struct locality_callee_info): New struct.
> 	(struct locality_info): Ditto.
> 	(loc_infos): Ditto.
> 	(get_locality_info): New function.
> 	(sort_all_callees_default): Ditto.
> 	(callee_default_cmp): Ditto.
> 	(populate_callee_locality_info): Ditto.
> 	(populate_caller_locality_info): Ditto.
> 	(create_locality_info): Ditto.
> 	(adjust_recursive_callees): Access node_to_clone by reference.
> 	(inline_clones): Access node_to_clone and clone_to_node by reference.
> 	(clone_node_as_needed): Ditto.
> 	(accumulate_incoming_edge_frequency): Remove function.
> 	(clone_node_p): New function.
> 	(partition_callchain): Refactor the function.
> 	(is_entry_node_p): Call local_p ().
> 	(locality_determine_ipa_order): Call create_locality_info ().
> 	(locality_determine_static_order): Ditto.
> 	(locality_partition_and_clone): Update call to partition_callchain ()
> 	according to the new prototype.
> 	(lc_execute): Allocate and free node_to_ch_info, node_to_clone,
> 	clone_to_node.
>
> <0002-PATCH-2-3-ipa-reorder-for-locality-Address-compile-t.patch>
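
For anyone skimming the thread, the precomputation described in the quoted
message can be pictured roughly as below.  This is only an illustrative
sketch and is not code from the patch: Edge, LocalityInfo, root_of and
build_locality_info are hypothetical names, nodes are plain integers rather
than cgraph nodes, and the handling of inlined nodes is an assumption based
on the description (edges out of an inlined body are attributed to its
inlined_to root, and edges into an inlined body are skipped).

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a call graph edge; not GCC's cgraph_edge.
struct Edge {
  int caller;
  int callee;
  double freq;
};

// Hypothetical analogue of the per-node data described in the message.
struct LocalityInfo {
  // Accumulated incoming edge frequency per unique caller.
  std::map<int, double> caller_freq;
  // Unique callees with accumulated frequency, hottest first.
  std::vector<std::pair<int, double>> callees;
};

// inlined_to[n] is the node that n is inlined into, or -1 if n is not inlined.
static int root_of(int node, const std::vector<int> &inlined_to) {
  while (inlined_to[node] != -1)
    node = inlined_to[node];
  return node;
}

// Walk every edge once and build the per-node summaries up front, so that a
// later partitioning step could read them instead of re-walking the edges.
std::vector<LocalityInfo>
build_locality_info(int n_nodes, const std::vector<Edge> &edges,
                    const std::vector<int> &inlined_to) {
  std::vector<LocalityInfo> infos(n_nodes);
  std::vector<std::map<int, double>> out(n_nodes);

  for (const Edge &e : edges) {
    // Assumption: an inlined callee travels with its root, so the edge
    // into it needs no entry of its own.
    if (inlined_to[e.callee] != -1)
      continue;
    // Calls made from an inlined body are attributed to the root caller.
    int caller = root_of(e.caller, inlined_to);
    infos[e.callee].caller_freq[caller] += e.freq;
    out[caller][e.callee] += e.freq;
  }

  for (int n = 0; n < n_nodes; ++n) {
    infos[n].callees.assign(out[n].begin(), out[n].end());
    // Sort unique callees by descending accumulated frequency.
    std::stable_sort(infos[n].callees.begin(), infos[n].callees.end(),
                     [](const std::pair<int, double> &a,
                        const std::pair<int, double> &b) {
                       return a.second > b.second;
                     });
  }
  return infos;
}
```

In this sketch each edge is visited once, and a consumer would only look up
infos[node] afterwards.  Whether the patch structures its data this way is
not stated in the message above.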
