https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63155
--- Comment #21 from Richard Biener <rguenth at gcc dot gnu.org> --- So for the alternate testcase we have Kind Nodes Bytes ssa names 753988 54287136 GIMPLE statements Kind Stmts Bytes phi nodes 747999 381049752 Bitmaps Leak Peak Times N searches Search iter Type tree-ssa-coalesce.c:705 (new_live_track) 91800: 0.0% 91800 6469037: 1.8% 1238519 248500 heap tree-ssa-coalesce.c:586 (ssa_conflicts_add_one) 9916176: 3.5%4977905016 341148023: 94.8% 364290728 601796535 heap tree-ssa-live.c:937 (new_tree_live_info) 119177600: 42.5% 119177600 2979440: 0.8% 2230467 0 heap tree-ssa-live.c:938 (new_tree_live_info) 149077680: 53.1% 149077680 4476926: 1.2% 4467438 1379139 heap as expected the conflict bitmaps are the issue. We can shave off some more compile-time by noticing the asymmetry in live_track_process_def and use bitmap_ior_into for one half. That should be more cache friendly as well. before: > /usr/bin/time ./cc1 -quiet t2.c 38.98user 1.40system 0:40.39elapsed 99%CPU (0avgtext+0avgdata 5626832maxresident)k 0inputs+440outputs (0major+1448898minor)pagefaults 0swaps after: > /usr/bin/time ./cc1 -quiet t2.c 35.81user 1.48system 0:37.30elapsed 99%CPU (0avgtext+0avgdata 5628100maxresident)k 0inputs+440outputs (0major+1440532minor)pagefaults 0swaps not as much as with the previous patch but still... Moves the above to tree-ssa-coalesce.c:812 (live_track_process_def) 9896264: 3.5%4958012960 339784598: 94.4% 301208226 660769345 heap and then we can of course avoid touching bitmaps with already recorded conflicts. Doing that improves things to 33.18user 1.43system 0:34.62elapsed 99%CPU (0avgtext+0avgdata 5626660maxresident)k 0inputs+440outputs (0major+1390642minor)pagefaults 0swaps tree-ssa-coalesce.c:803 (live_track_process_def) 19912: 0.0% 19892656 1363425: 0.4% 960492 1915158 heap tree-ssa-coalesce.c:810 (live_track_process_def) 9896264: 3.5%4958012960 339784598: 94.4% 301208226 660769345 heap note it all seems to be a bit noisy in the end... @@ -818,8 +794,20 @@ live_track_process_def (live_track *ptr, if (bitmap_bit_p (ptr->live_base_var, root)) { b = ptr->live_base_partitions[root]; - EXECUTE_IF_SET_IN_BITMAP (b, 0, x, bi) - ssa_conflicts_add (graph, p, x); + bitmap bp = graph->conflicts[p]; + if (!bp) + bp = graph->conflicts[p] = BITMAP_ALLOC (&graph->obstack); + /* Add a conflict to P to each live base partition. */ + EXECUTE_IF_AND_COMPL_IN_BITMAP (b, bp, 0, x, bi) + { + gcc_checking_assert (x != (unsigned)p); + bitmap bx = graph->conflicts[x]; + if (!bx) + bx = graph->conflicts[x] = BITMAP_ALLOC (&graph->obstack); + bitmap_set_bit (bx, p); + } + /* Add conflicts to each live base partition to P. */ + bitmap_ior_into (bp, b); } } ideally we'd be able to combine the ior_into with the EXECUTE_IF.. walk. Manually jump-threading the case of !bp might be worth it as well, since we are micro-optimizing this...