https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119712

--- Comment #7 from Andrew Macleod <amacleod at redhat dot com> ---
Bah. Almost, but not quite.

Ranger's cache algorithm must remain as it is... it will always produce a better
result, but we need to calculate everything properly.
Going back to the original analysis:

FWD visiting block 4 for _11  starting range : [irange] int [-INF, 2147483644]
MASK 0xfffffffc VALUE 0x1
  edge 6->4 :[irange] int [-INF, 2147483644] MASK 0xfffffffc VALUE 0x1
  edge 7->4 :[irange] int [0, 1]
  edge 3->4 :[irange] int [1, 1] MASK 0xfffffffc VALUE 0x1
Updating range to [irange] int [-INF, 2147483644] MASK 0xfffffffd VALUE 0x0

and second iteration:
FWD visiting block 4 for _11  starting range : [irange] int [-INF, 2147483644]
MASK 0xfffffffd VALUE 0x0
  edge 6->4 :[irange] int [-INF, 2147483644] MASK 0xfffffffc VALUE 0x1
  edge 7->4 :[irange] int [1, 1]
  edge 3->4 :[irange] int [1, 1] MASK 0xfffffffc VALUE 0x1
  Updating range to [irange] int [-INF, 2147483644] MASK 0xfffffffc VALUE 0x1

The initial value of _11 is actually more precise than the second: it excludes
the subrange [0, 0], which the second one does not.
The evaluation on the edge 7->4 is the important thing to note.  On the
first iteration it includes 0, which is why _11 contains 0 on the second
iteration.  Then it is reduced to [1, 1] in the second case.

During the GORI evaluation cycle, we end up utilizing adjust_range() during the
calculation of 7->4 the second time because an intersection actually does
something, and so we evaluate _11 with the mask expanded to [-INF, 1][4,
2147483626] instead of [-INF, 2147483644].  This eliminates 0 from the range on
that edge, and ends up causing the cycle of flip-flopping values.

The right thing to do would be to always expand those bits when we set a
bitmask.  The initial range would then become:

[irange] int [-INF, -3][1, 1][4, 2147483644] MASK 0xfffffffc VALUE 0x1

And when this feeds into the update cycle, we no longer have this flip-flop
issue, because zero is excluded on both iterations.  This eliminates
any bitmask effect on the algorithm, and it returns to being correct.

I am running regressions on this change now.  There is a small timing penalty,
as making these adjustments to all ranges is a bit pervasive.  It's only 0.18%
over a compilation unit, but it adds up to about 1.3% in VRP.  Ick.  We'll
probably see some minor improvement in other places due to the better
subranges, but I will keep thinking about how to tweak that.

This is the new potential fix:

diff --git a/gcc/value-range.cc b/gcc/value-range.cc
index 5136674b361..0730df2ded7 100644
--- a/gcc/value-range.cc
+++ b/gcc/value-range.cc
@@ -2341,6 +2341,8 @@ irange::update_bitmask (const irange_bitmask &bm)
   m_bitmask = bm;
   if (!set_range_from_bitmask ())
     normalize_kind ();
+  // Trim out lower values from the range which the bitmask disallows.
+  m_bitmask.adjust_range (*this);
   if (flag_checking)
     verify_range ();
 }
