https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63155

--- Comment #21 from Richard Biener <rguenth at gcc dot gnu.org> ---
So for the alternate testcase we have

Kind                   Nodes      Bytes
ssa names             753988   54287136

GIMPLE statements
Kind                   Stmts      Bytes
phi nodes             747999  381049752

Bitmaps                                                 Leak            Peak   
        Times  N searches Search iter      Type
tree-ssa-coalesce.c:705 (new_live_track)              91800:  0.0%     91800  
6469037:  1.8%     1238519      248500      heap
tree-ssa-coalesce.c:586 (ssa_conflicts_add_one)     9916176:  3.5%4977905016
341148023: 94.8%   364290728   601796535      heap
tree-ssa-live.c:937 (new_tree_live_info)          119177600: 42.5% 119177600  
2979440:  0.8%     2230467           0      heap
tree-ssa-live.c:938 (new_tree_live_info)          149077680: 53.1% 149077680  
4476926:  1.2%     4467438     1379139      heap

as expected the conflict bitmaps are the issue.

We can shave off some more compile-time by noticing the asymmetry in
live_track_process_def and use bitmap_ior_into for one half.  That should
be more cache friendly as well.

before:

> /usr/bin/time ./cc1 -quiet t2.c
38.98user 1.40system 0:40.39elapsed 99%CPU (0avgtext+0avgdata
5626832maxresident)k
0inputs+440outputs (0major+1448898minor)pagefaults 0swaps

after:

> /usr/bin/time ./cc1 -quiet t2.c
35.81user 1.48system 0:37.30elapsed 99%CPU (0avgtext+0avgdata
5628100maxresident)k
0inputs+440outputs (0major+1440532minor)pagefaults 0swaps

not as much as with the previous patch but still...

Moves the above to

tree-ssa-coalesce.c:812 (live_track_process_def)    9896264:  3.5%4958012960
339784598: 94.4%   301208226   660769345      heap

and then we can of course avoid touching bitmaps with already recorded
conflicts.  Doing that improves things to

33.18user 1.43system 0:34.62elapsed 99%CPU (0avgtext+0avgdata
5626660maxresident)k
0inputs+440outputs (0major+1390642minor)pagefaults 0swaps

tree-ssa-coalesce.c:803 (live_track_process_def)      19912:  0.0%  19892656  
1363425:  0.4%      960492     1915158      heap
tree-ssa-coalesce.c:810 (live_track_process_def)    9896264:  3.5%4958012960
339784598: 94.4%   301208226   660769345      heap

note it all seems to be a bit noisy in the end...

@@ -818,8 +794,20 @@ live_track_process_def (live_track *ptr,
   if (bitmap_bit_p (ptr->live_base_var, root))
     {
       b = ptr->live_base_partitions[root];
-      EXECUTE_IF_SET_IN_BITMAP (b, 0, x, bi)
-        ssa_conflicts_add (graph, p, x);
+      bitmap bp = graph->conflicts[p];
+      if (!bp)
+       bp = graph->conflicts[p] = BITMAP_ALLOC (&graph->obstack);
+      /* Add a conflict to P to each live base partition.  */
+      EXECUTE_IF_AND_COMPL_IN_BITMAP (b, bp, 0, x, bi)
+       {
+         gcc_checking_assert (x != (unsigned)p);
+         bitmap bx = graph->conflicts[x];
+         if (!bx)
+           bx = graph->conflicts[x] = BITMAP_ALLOC (&graph->obstack);
+         bitmap_set_bit (bx, p);
+       }
+      /* Add conflicts to each live base partition to P.  */
+      bitmap_ior_into (bp, b);
     }
 }

ideally we'd be able to combine the ior_into with the EXECUTE_IF.. walk.
Manually jump-threading the case of !bp might be worth it as well, since
we are micro-optimizing this...

Reply via email to