On Wed, 25 Sep 2024, Andrew MacLeod wrote:

> 
> On 9/25/24 06:51, Richard Biener wrote:
> > dom_oracle::register_transitives contains an unbound dominator walk
> > which for the testcase in PR114855 dominates the profile.  I've also
> > noticed odd behavior in the case when set_one_relation returns NULL,
> > we'd then completely abort processing other relations.  The following
> 
> 
> I think the rationale for that was originally we only get NULL back when we
> tried to set a relation which was already present... which means any further
> processing should already be reflected above it.

OK, I'll back out that particular change then and instead add a comment.
I wasn't sure this would be the case as we don't seem to process
transitives of transitives (but I also didn't spend a lot of time
thinking about that).

> Since then, we added a NULL return if we are trying to set a relation and we
> are exceeding the number of relations in the block that are allowed
> (param_relation_block_limit).  In this case, its possible we might miss
> relations higher in the dominator structure, but Once we start hitting a limit
> like that, one could also argue that we are already missing perfection,  have
> large numbers of things, and perhaps its best to stop anyway :-)

If we hit the limit we wouldn't add anything, set_one_relation will
try to add to the root_block independent of the BB it picked up
relations to combine with.

> > fixes the latter by continuing searching and fixes the unbound work
> > done by assigning a constant work budget to the loop, bounding the
> > number of dominators visited but also the number of relations
> > processed.  This gets both dom_oracle::register_transitives and
> > get_immediate_dominator off the profile.
> >
> > I'll note that we're still doing an unbound dominator walk via
> > equiv_set in find_equiv_dom at the start of the function and when
> > we register a relation that also looks up the same way.  At least
> > for the testcase at hand this isn't an issue.
> 
> Equivalences tend to be much smaller scale since they aren't a bunch of
> individual relations that can have various relations.  I havent seen much
> issue with equivalence processing.

The issue noted is more that when there is an equivalence recorded
we'll always walk up to the dominator root to find it.  The
m_equiv_set bitmap probably avoids processing in most cases when
there's no equivalency though.

> 
> >
> > I've wondered whether instead of having a per-BB bitmap of names
> > we have a relation for in that BB having a bitmap per SSA name of
> > which BBs have a relation for it would be a way to avoid all those
> > walks.
> 
> There is a global bitmap which says whether an SSA even has a relation
> anywhere, and if it doesn't we don't do the walk at all..  So we are already
> avoiding the walk if we know there cant be a relation in any BB.  that already
> eliminates most of the unnecessary walks.
> 
> If we had a per-SSA bitmap, we'd still have to walk the dominator tree and
> check that SSA bitmap for each BB to see if block has a relation or not.. so I
> think it would amount to the same thing? I think your just trading a BB based
> bitmap for an SSA name based bitmap, and the end result would be the same I
> think?  Well perhaps that would be a bit more efficient since its the same
> bitmap all the time rather than a different one for each BB.
> 
> I added a note to the PR before I saw this... we can just disable transitives
> when the graph gets too big... I don't think they are worth the expense when
> things get large.   I've attached that patch if you want to consider it.  It
> probably should have been included when I added the switchover to fast VRP,
> but I wasn't thinking about ranger still being used by the threader.
> 
> That said,  I'm also OK with your patch if you'd rather to go with it.

To comment here and not in the PR aboud simply disabling feature X when
"the CFG is too large" (or other metric) - I think we should try to be
more nuanced and apply limiting so there isn't a hard border on some
size metric where before GCC get's slower with increasing size and
after we're fast again but lousy to optimize even "cheap" cases.

That said, my main issue right now is that DOM at -O1 is comparatively
very expensive on those testcases - if you look at a profile 98% of
it's time is spent in ranger code.  Fortunately we do have
-fexpensive-optimizations, so we maybe want to use that to default
transitives but I'm also thinking whether it's possible to restrict
DOMs ranger processing to global ranges?  DOM does enable_ranger ()
and not just it's range-query so I'm not sure I can easily subsitute
global range query?

Anyway, I'm re-testing the following revised version and will install it.

Thanks,
Richard.

>From 48fb0382d23ad6c58ca7573965843509828a07e8 Mon Sep 17 00:00:00 2001
From: Richard Biener <rguent...@suse.de>
Date: Wed, 25 Sep 2024 10:38:12 +0200
Subject: [PATCH] tree-optimization/114855 - speed up
 dom_oracle::register_transitives
To: gcc-patches@gcc.gnu.org

dom_oracle::register_transitives contains an unbound dominator walk
which for the testcase in PR114855 dominates the profile.  The following
fixes the unbound work done by assigning a constant work budget to the
loop, bounding the number of dominators visited but also the number of
relations processed.  This gets both dom_oracle::register_transitives and
get_immediate_dominator off the profile.

I'll note that we're still doing an unbound dominator walk via
equiv_set in find_equiv_dom at the start of the function and when
we register a relation that also looks up the same way.  At least
for the testcase at hand this isn't an issue.

I've also amended the guard to register_transitives with the
per-basic-block limit for the number of relations registered not
being exhausted.

        PR tree-optimization/114855
        * params.opt (--param transitive-relations-work-bound): New.
        * doc/invoke.texi (--param transitive-relations-work-bound):
        Document.
        * value-relation.cc (dom_oracle::register_transitives):
        Assing an overall work budget, bounding the dominator walk and
        the number of relations processed.
        (dom_oracle::record): Only register_transitives when the
        number of already registered relations does not yet exceed
        the per-BB limit.
---
 gcc/doc/invoke.texi   |  3 +++
 gcc/params.opt        |  4 ++++
 gcc/value-relation.cc | 55 +++++++++++++++++++++++++++++--------------
 3 files changed, 44 insertions(+), 18 deletions(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index bdbbea53666..bd1208a62ee 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -17144,6 +17144,9 @@ in the outgoing range calculator.
 @item relation-block-limit
 Maximum number of relations the oracle will register in a basic block.
 
+@item transitive-relations-work-bound
+Work bound when discovering transitive relations from existing relations.
+
 @item min-pagesize
 Minimum page size for warning purposes.
 
diff --git a/gcc/params.opt b/gcc/params.opt
index 949b4754498..a08e4c1042d 100644
--- a/gcc/params.opt
+++ b/gcc/params.opt
@@ -924,6 +924,10 @@ outgoing range calculator.
 Common Joined UInteger Var(param_relation_block_limit) Init(200) 
IntegerRange(0, 9999) Param Optimization
 Maximum number of relations the oracle will register in a basic block.
 
+-param=transitive-relations-work-bound=
+Common Joined UInteger Var(param_transitive_relations_work_bound) Init(256) 
IntegerRange(0, 9999) Param Optimization
+Work bound when discovering transitive relations from existing relations.
+
 -param=rpo-vn-max-loop-depth=
 Common Joined UInteger Var(param_rpo_vn_max_loop_depth) Init(7) 
IntegerRange(2, 65536) Param Optimization
 Maximum depth of a loop nest to fully value-number optimistically.
diff --git a/gcc/value-relation.cc b/gcc/value-relation.cc
index d6ad2dd984f..d8a2ed920a8 100644
--- a/gcc/value-relation.cc
+++ b/gcc/value-relation.cc
@@ -1093,7 +1093,9 @@ dom_oracle::record (basic_block bb, relation_kind k, tree 
op1, tree op2)
       bool check = bitmap_bit_p (m_relation_set, SSA_NAME_VERSION (op1))
                   || bitmap_bit_p (m_relation_set, SSA_NAME_VERSION (op2));
       relation_chain *ptr = set_one_relation (bb, k, op1, op2);
-      if (ptr && check)
+      if (ptr && check
+         && (m_relations[bb->index].m_num_relations
+             < param_relation_block_limit))
        register_transitives (bb, *ptr);
     }
 }
@@ -1204,7 +1206,13 @@ dom_oracle::register_transitives (basic_block root_bb,
   const_bitmap equiv1 = equiv_set (relation.op1 (), root_bb);
   const_bitmap equiv2 = equiv_set (relation.op2 (), root_bb);
 
-  for (bb = root_bb; bb; bb = get_immediate_dominator (CDI_DOMINATORS, bb))
+  const unsigned work_budget = param_transitive_relations_work_bound;
+  unsigned avail_budget = work_budget;
+  for (bb = root_bb; bb;
+       /* Advancing to the next immediate dominator eats from the budget,
+         if none is left after that there's no point to continue.  */
+       bb = (--avail_budget > 0
+            ? get_immediate_dominator (CDI_DOMINATORS, bb) : nullptr))
     {
       int bbi = bb->index;
       if (bbi >= (int)m_relations.length())
@@ -1247,27 +1255,38 @@ dom_oracle::register_transitives (basic_block root_bb,
 
          // Ignore if both NULL (not relevant relation) or the same,
          if (r1 == r2)
-           continue;
+           ;
 
-         // Any operand not an equivalence, just take the real operand.
-         if (!r1)
-           r1 = relation.op1 ();
-         if (!r2)
-           r2 = relation.op2 ();
-
-         value_relation nr (relation.kind (), r1, r2);
-         if (nr.apply_transitive (*ptr))
+         else
            {
-             if (!set_one_relation (root_bb, nr.kind (), nr.op1 (), nr.op2 ()))
-               return;
-             if (dump_file && (dump_flags & TDF_DETAILS))
+             // Any operand not an equivalence, just take the real operand.
+             if (!r1)
+               r1 = relation.op1 ();
+             if (!r2)
+               r2 = relation.op2 ();
+
+             value_relation nr (relation.kind (), r1, r2);
+             if (nr.apply_transitive (*ptr))
                {
-                 fprintf (dump_file, "   Registering transitive relation ");
-                 nr.dump (dump_file);
-                 fputc ('\n', dump_file);
+                 // If the new relation is already present we know any
+                 // further processing is already reflected above it.
+                 // When we ran into the limit of relations on root_bb
+                 // we can give up as well.
+                 if (!set_one_relation (root_bb, nr.kind (),
+                                        nr.op1 (), nr.op2 ()))
+                   return;
+                 if (dump_file && (dump_flags & TDF_DETAILS))
+                   {
+                     fprintf (dump_file,
+                              "   Registering transitive relation ");
+                     nr.dump (dump_file);
+                     fputc ('\n', dump_file);
+                   }
                }
            }
-
+         /* Processed one relation, abort if we've eaten up our budget.  */
+         if (--avail_budget == 0)
+           return;
        }
     }
 }
-- 
2.43.0

Reply via email to