On Wed, 19 Sep 2018, Richard Biener wrote:

> 
> The second testcase in the above PR runs into our O(N) bitmap element
> search limitation and spends 8s (60%) of the compile-time in the SSA 
> propagator
> engine (when optimizing).  The patch improves that to 0.9s (15%).  For the 
> first testcase it isn't that bad but still the patch improves CCP from 36% to 
> 14%.
> 
> The "solution" is to use sbitmap instead of a bitmap to avoid
> the linearity when doing add_ssa_edge.  We pay for that (but not
> actually with the testcases) with a linear-time bitmap_first_set_bit
> in process_ssa_edge_worklist.  I do not (yet...) have a testcase
> that overall gets slower with this approach.  I suppose using
> std::set<int> would "solve" the complexity issue but we'd pay
> back with horribly inefficient memory use.  Similarly with
> our sparse bitmap implementation which lacks an ordered
> first_set_bit (it only can get any set bit fast, breaking optimal
> iteration order).
> 
> If we'd only had a O(log n) search sparse bitmap implementation ...
> (Steven posted patches to switch bitmap from/to such one but IIRC
> that at least lacked bitmap_first_set_bit).
> 
> Bootstrapped and tested on x86_64-unknown-linux-gnu.
> 
> OK for trunk?

So it turns out that while the bitmap data structure isn't optimal
the issue with the testcase is that we end up with a full-set-universe
for the SSA worklist mostly because we are queueing PHIs via uses
on backedges that are not yet executable (so we'd re-simulate the PHI
anyway once that became excutable - no need to set the individual bits).

So I'm testing the following instead.

Bootstrap & regtest running on x86_64-unknown-linux-gnu.

Richard.

2018-09-20  Richard Biener  <rguent...@suse.de>

        PR tree-optimization/63155
        * tree-ssa-propagate.c (add_ssa_edge): Avoid adding PHIs to
        the worklist when the edge of the respective argument isn't
        executable.

Index: gcc/tree-ssa-propagate.c
===================================================================
--- gcc/tree-ssa-propagate.c    (revision 264438)
+++ gcc/tree-ssa-propagate.c    (working copy)
@@ -168,10 +168,18 @@ add_ssa_edge (tree var)
   FOR_EACH_IMM_USE_FAST (use_p, iter, var)
     {
       gimple *use_stmt = USE_STMT (use_p);
+      basic_block use_bb = gimple_bb (use_stmt);
 
       /* If we did not yet simulate the block wait for this to happen
          and do not add the stmt to the SSA edge worklist.  */
-      if (! (gimple_bb (use_stmt)->flags & BB_VISITED))
+      if (! (use_bb->flags & BB_VISITED))
+       continue;
+
+      /* If this is a use on a not yet executable edge do not bother to
+        queue it.  */
+      if (gimple_code (use_stmt) == GIMPLE_PHI
+         && !(EDGE_PRED (use_bb, PHI_ARG_INDEX_FROM_USE (use_p))->flags
+              & EDGE_EXECUTABLE))
        continue;
 
       if (prop_simulate_again_p (use_stmt)

Reply via email to