https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54052

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org      |rguenth at gcc dot 
gnu.org
             Status|NEW                         |ASSIGNED

--- Comment #12 from Richard Biener <rguenth at gcc dot gnu.org> ---
callgrind is of couse somewhat hopeless but after 20min we have seen
23229 blocks (over 17 invocations of the function) and sofar
155551845 invocations of bitmap_ior_into from that very place (line 652).

I wonder if we really need 'unfiltered' and could not "filter" it with
DF_LR_IN.  Because obviously computing all regs ever defined in any
dominated block requires quadratic memory?

Doing

diff --git a/gcc/rtl-ssa/blocks.cc b/gcc/rtl-ssa/blocks.cc
index 8996443e8d5..9d1cd1b0365 100644
--- a/gcc/rtl-ssa/blocks.cc
+++ b/gcc/rtl-ssa/blocks.cc
@@ -649,7 +649,8 @@ function_info::place_phis (build_info &bi)
       bitmap_iterator bmi;
       unsigned int b2;
       EXECUTE_IF_SET_IN_BITMAP (&frontiers[b1], 0, b2, bmi)
-       if (bitmap_ior_into (&unfiltered[b2], b1_def)
+       if (bitmap_ior_and_into (&unfiltered[b2], b1_def,
+                                DF_LR_IN (BASIC_BLOCK_FOR_FN (m_fn, b2)))
            && !bitmap_empty_p (&frontiers[b2]))
          // Propagate the (potential) new phi node definitions in B2.
          bitmap_set_bit (worklist, b2);

shrinks the compile-time of the testcase to 63s:

 forward prop                       :   3.02 (  5%)
 combiner                           :  30.01 ( 47%)
 TOTAL                              :  63.59
63.59user 0.84system 1:04.45elapsed 99%CPU (0avgtext+0avgdata
1127464maxresident)k
0inputs+25160outputs (0major+262276minor)pagefaults 0swaps

Untested, of course.  Testing now.

Reply via email to