https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54052
Richard Biener <rguenth at gcc dot gnu.org> changed: What |Removed |Added ---------------------------------------------------------------------------- Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org Status|NEW |ASSIGNED --- Comment #12 from Richard Biener <rguenth at gcc dot gnu.org> --- callgrind is of couse somewhat hopeless but after 20min we have seen 23229 blocks (over 17 invocations of the function) and sofar 155551845 invocations of bitmap_ior_into from that very place (line 652). I wonder if we really need 'unfiltered' and could not "filter" it with DF_LR_IN. Because obviously computing all regs ever defined in any dominated block requires quadratic memory? Doing diff --git a/gcc/rtl-ssa/blocks.cc b/gcc/rtl-ssa/blocks.cc index 8996443e8d5..9d1cd1b0365 100644 --- a/gcc/rtl-ssa/blocks.cc +++ b/gcc/rtl-ssa/blocks.cc @@ -649,7 +649,8 @@ function_info::place_phis (build_info &bi) bitmap_iterator bmi; unsigned int b2; EXECUTE_IF_SET_IN_BITMAP (&frontiers[b1], 0, b2, bmi) - if (bitmap_ior_into (&unfiltered[b2], b1_def) + if (bitmap_ior_and_into (&unfiltered[b2], b1_def, + DF_LR_IN (BASIC_BLOCK_FOR_FN (m_fn, b2))) && !bitmap_empty_p (&frontiers[b2])) // Propagate the (potential) new phi node definitions in B2. bitmap_set_bit (worklist, b2); shrinks the compile-time of the testcase to 63s: forward prop : 3.02 ( 5%) combiner : 30.01 ( 47%) TOTAL : 63.59 63.59user 0.84system 1:04.45elapsed 99%CPU (0avgtext+0avgdata 1127464maxresident)k 0inputs+25160outputs (0major+262276minor)pagefaults 0swaps Untested, of course. Testing now.