On 9/25/24 06:51, Richard Biener wrote:
dom_oracle::register_transitives contains an unbound dominator walk which for the testcase in PR114855 dominates the profile. I've also noticed odd behavior in the case when set_one_relation returns NULL, we'd then completely abort processing other relations. The following
I think the rationale for that was originally we only get NULL back when we tried to set a relation which was already present... which means any further processing should already be reflected above it.
Since then, we added a NULL return if we are trying to set a relation and we are exceeding the number of relations in the block that are allowed (param_relation_block_limit). In this case, its possible we might miss relations higher in the dominator structure, but Once we start hitting a limit like that, one could also argue that we are already missing perfection, have large numbers of things, and perhaps its best to stop anyway :-)
fixes the latter by continuing searching and fixes the unbound work done by assigning a constant work budget to the loop, bounding the number of dominators visited but also the number of relations processed. This gets both dom_oracle::register_transitives and get_immediate_dominator off the profile. I'll note that we're still doing an unbound dominator walk via equiv_set in find_equiv_dom at the start of the function and when we register a relation that also looks up the same way. At least for the testcase at hand this isn't an issue.
Equivalences tend to be much smaller scale since they aren't a bunch of individual relations that can have various relations. I havent seen much issue with equivalence processing.
I've wondered whether instead of having a per-BB bitmap of names we have a relation for in that BB having a bitmap per SSA name of which BBs have a relation for it would be a way to avoid all those walks.
There is a global bitmap which says whether an SSA even has a relation anywhere, and if it doesn't we don't do the walk at all.. So we are already avoiding the walk if we know there cant be a relation in any BB. that already eliminates most of the unnecessary walks.
If we had a per-SSA bitmap, we'd still have to walk the dominator tree and check that SSA bitmap for each BB to see if block has a relation or not.. so I think it would amount to the same thing? I think your just trading a BB based bitmap for an SSA name based bitmap, and the end result would be the same I think? Well perhaps that would be a bit more efficient since its the same bitmap all the time rather than a different one for each BB.
I added a note to the PR before I saw this... we can just disable transitives when the graph gets too big... I don't think they are worth the expense when things get large. I've attached that patch if you want to consider it. It probably should have been included when I added the switchover to fast VRP, but I wasn't thinking about ranger still being used by the threader.
That said, I'm also OK with your patch if you'd rather to go with it. Andrew
* gimple-range-cache.cc (ranger_cache::ranger_cache): Do not process transitives when the CFG is too large. diff --git a/gcc/gimple-range-cache.cc b/gcc/gimple-range-cache.cc index 4b94e0db7aa..f665925f630 100644 --- a/gcc/gimple-range-cache.cc +++ b/gcc/gimple-range-cache.cc @@ -1003,7 +1003,8 @@ ranger_cache::ranger_cache (int not_executable_flag, bool use_imm_uses) m_temporal = new temporal_cache; // If DOM info is available, spawn an oracle as well. - create_relation_oracle (); + bool transitives_p = (last_basic_block_for_fn (cfun) < param_vrp_block_limit); + create_relation_oracle (transitives_p); create_infer_oracle (use_imm_uses); create_gori (not_executable_flag, param_vrp_switch_limit);