[COMMITTED] PR tree-optimization/120661 - Snap subrange boundries to bitmask constraints.

Andrew MacLeod Tue, 17 Jun 2025 18:28:02 -0700

These failing cases all contain a situation where the cache isattempting to update values involving a back edge. We propagate valuesin the cache, and then recalculate range-on-entry values until thevalues settle.

Unfortunately, we do not really have canonical representation for rangeswhich involve bitmasks. We kind of ad-hoc it and defer applying themask until it seems important. I had made some previous attempts toremove a few invalid ranges, specifically for trailing bits that wereknown, and removed the lower and higher subranges that were notapplicable. ie


 signed char  [-INF, +INF] MASK 0xF0 VALUE 0x0

we convert that to

[-INF, -16][0, 0][16, 112]    MASK 0xF0 VALUE 0x0

But we only do it with the lower subrange (ie all unknown bits are 0)and adjust the upper bound of the entire range. It seems expensive togo do it for every subrange in a multi-range

But this leads to some issues. in the 2 testcases int he PR, we endedup with some sub-ranges which were actually UNDEFINED if you properlyapplied the submask.

This lead to a flip flopping situation where as we iterated, the valuescalculated would cycle between 2 or 3 values. In both cases, there werevalues which were not valid in the bitmask which caused different valuesto be calculated along the way and used in evaluation producingdifferent results. As soon as the actual UNDEFINED value was reached,the original bitmask was again created and the cycle continued.

IN particular, the second test case at one point generated this rangevia some calculation:


 [2147483648, 2147483649] MASK 0xfffffc00 VALUE 0x1

which upon examination, the value 2147483648 cannot be true as thenumber must be odd. Yet other op1_range calculations using this valuetriggered an alternate value, which then produced

   [2147483648, 2147483648] MASK 0xfffffc00 VALUE 0x1

Which then caused another value to become UNDEFINED/.... and then thecycle started again with the original value produced.

If we eliminate the impossible bounds, and those stray values don'tbecome invoilved in calculations, all these problems utilizingimpossible values coming and going vanish.

THis patch is something I was working on before the PR came in, andfinishing it solves this issue.

When we adjust the range to match a bitmask in set_range_from_bitmask(),this patch now loops through the subranges, and "snaps" each boundary tothe nearest value which satisfies the bitmask. if no element of thesubrange satisfies the constraint, it is completely removed. ie


[3, 15][21,21][45,80] MASK 0xFE VALUE 0x0        //  Must be an even value.
    is "snapped" to
[4,14][46,80] MASK 0xFE VALUE 0x0

I will note we only do this for the trialing zero bits in the mask.. Itried for the entire bitmask, but that was HARD. so, its only thetrailing zeros, but considering thats the only thing we really adjust inother places, it works out OK.

Performance was not nearly as bad as I was afraid.. Only 0.4% slowerfor all of VRP and 0.04% overall.

Later I will revisit the entire bitmask implementation and see if we canbe more consistent.. make some attempt at proper canonicalization. we'vehad a number of ad-hoc improvements to it and it would be good to sitdown and see if it needs reworking.. Im pretty sure there is some workbeing done over and over we can improve on, but this is good enough for now.


Bootstraps on x86_64-pc-linux-gnu with no regressions.   Pushed.

Andrew

From 9808af57ef1d4cbafb40b2446fd6808cbf20b36d Mon Sep 17 00:00:00 2001
From: Andrew MacLeod <amacl...@redhat.com>
Date: Mon, 16 Jun 2025 15:41:47 -0400
Subject: [PATCH 1/4] Snap subrange boundries to bitmask constraints.

Ensure all subrange endpoints conform to the bitmask.

	PR tree-optimization/120661
	gcc/
	* value-range.cc (irange::snap): New.
	(irange::snap_subranges): New.
	(irange::set_range_from_bitmask): Call snap_subranges.
	* value-range.h (snap, snap_subranges): New prototypes.

	gcc/testsuite/
	* gcc.dg/pr120661-1.c: New.
	* gcc.dg/pr120661-2.c: New.
---
 gcc/testsuite/gcc.dg/pr120661-1.c | 51 +++++++++++++++++
 gcc/testsuite/gcc.dg/pr120661-2.c | 39 +++++++++++++
 gcc/value-range.cc                | 91 +++++++++++++++++++++++++++++++
 gcc/value-range.h                 |  2 +
 4 files changed, 183 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/pr120661-1.c
 create mode 100644 gcc/testsuite/gcc.dg/pr120661-2.c

diff --git a/gcc/testsuite/gcc.dg/pr120661-1.c b/gcc/testsuite/gcc.dg/pr120661-1.c
new file mode 100644
index 00000000000..abf9210050d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr120661-1.c
@@ -0,0 +1,51 @@
+/* { dg-do compile } */
+/* { dg-options "-std=c23 -Os" } */
+
+typedef __builtin_va_list __gnuc_va_list;
+typedef __gnuc_va_list va_list;
+
+int e, a, b;
+int f(int b, ...) {
+  va_list args;
+  __builtin_c23_va_start(args, b);
+  unsigned c = 1;
+  for (int d; d < b; ++d)
+    c = c ^ 1;
+  return c;
+}
+static int fn3(int l, int i, int n) {
+  int j;
+  goto k;
+r:
+  j = (f(e) + 1641730381) * l + 1189664732 * n + 1064 * i - 1545337304;
+  if (903562339 * j + n >= 0)
+    goto m;
+  goto ac;
+k:
+  if (0)
+    goto ad;
+  goto t;
+ad:
+  if (b)
+    goto s;
+  goto r;
+m:
+  goto ad;
+t:
+  j = l;
+  l = 800794 * j;
+  goto ad;
+s:
+  b = 2 * b + 1;
+  if (a + (long)j)
+    goto t;
+  i = n;
+  goto s;
+ac:
+}
+int main() {
+  if (a)
+    if (fn3(-1, 1, -1))
+      fn3(1, 0, 3);
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/pr120661-2.c b/gcc/testsuite/gcc.dg/pr120661-2.c
new file mode 100644
index 00000000000..05d976db450
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr120661-2.c
@@ -0,0 +1,39 @@
+/* { dg-do compile } */
+/* { dg-options "-std=c23 -O2" } */
+
+typedef __builtin_va_list __gnuc_va_list;
+typedef __gnuc_va_list va_list;
+
+int a, c, d;
+int e(int b, ...) {
+    va_list args;
+    __builtin_c23_va_start(args, b);
+
+    int r = 0;
+    for (int i = 0; i < b; i++) {
+        int v = __builtin_va_arg(args, int);
+        r += v;
+    }
+    __builtin_va_end (args);
+    return r;
+}
+int f() { e(0); }
+int main() {
+  int h = 0, g = 0;
+  goto l;
+i:
+  if (f() * h)
+    goto k;
+j:
+  h = h - 2;
+k:
+  d = 1200000000 * h + 10;
+  g = (long)g + -1000000000 * d + 1;
+  if (a * h >= 0) {
+    if (g + (c - (long)1) >= 0)
+      goto i;
+    return 0;
+  }
+l:
+  goto j;
+}
diff --git a/gcc/value-range.cc b/gcc/value-range.cc
index e2d75f59c2e..5e97fdb7691 100644
--- a/gcc/value-range.cc
+++ b/gcc/value-range.cc
@@ -2254,6 +2254,94 @@ irange::invert ()
     verify_range ();
 }
 
+// This routine will take the bounds [LB, UB], and apply the bitmask to those
+// values such that both bounds satisfy the bitmask.  TRUE is returned
+// if either bound changes, and they are retuirned as [NEW_LB, NEW_UB].
+// if NEW_UB < NEW_LB, then the entire bound is to be removed as none of
+// the values are valid.
+//   ie,   [4, 14] MASK 0xFFFE  VALUE 0x1
+// means all values must be odd, the new bounds returned will be [5, 13].
+//   ie,   [4, 4] MASK 0xFFFE  VALUE 0x1
+// would return [1, 0] and as the LB < UB,   the entire subrange is invalid
+// and should be removed.
+
+bool
+irange::snap (const wide_int &lb, const wide_int &ub,
+	      wide_int &new_lb, wide_int &new_ub)
+{
+  uint z = wi::ctz (m_bitmask.mask ());
+  if (z == 0)
+    return false;
+  const wide_int &wild_mask = m_bitmask.mask ();
+
+  const wide_int step = (wi::one (TYPE_PRECISION (type ())) << z);
+  const wide_int match_mask = step - 1;
+  const wide_int value = m_bitmask.value () & match_mask;
+
+  wide_int rem_lb = lb & match_mask;
+
+  wi::overflow_type ov_sub;
+  wide_int diff = wi::sub(value, rem_lb, UNSIGNED, &ov_sub);
+  wide_int offset = diff & match_mask;
+
+  wi::overflow_type ov1;
+  new_lb = wi::add (lb, offset, UNSIGNED, &ov1);
+
+  wide_int rem_ub = ub & match_mask;
+  wide_int offset_ub = (rem_ub - value) & match_mask;
+
+  wi::overflow_type ov2;
+  new_ub = wi::sub (ub, offset_ub, UNSIGNED, &ov2);
+
+  // Overflow or inverted range = invalid
+  if (ov1 != wi::OVF_NONE || ov2 != wi::OVF_NONE
+      || wi::lt_p (new_ub, new_lb, TYPE_SIGN (type ())))
+    {
+      new_lb = wi::one (lb.get_precision ());
+      new_ub = wi::zero (ub.get_precision ());
+      return true;
+    }
+  return (new_lb != lb) || (new_ub != ub);
+}
+
+// This method loops through the subranges in THIS, and adjusts any bounds
+// to satisfy the contraints of the BITMASK.  If a subrange is invalid,
+// it is removed.   TRUE is returned if there were any changes.
+
+bool
+irange::snap_subranges ()
+{
+  bool changed = false;
+  int_range_max invalid;
+  unsigned x;
+  wide_int lb, ub;
+  for (x = 0; x < m_num_ranges; x++)
+    {
+      if (snap (lower_bound (x), upper_bound (x), lb, ub))
+	{
+	  changed = true;
+	  //  This subrange is to be completely removed.
+	  if (wi::lt_p (ub, lb, TYPE_SIGN (type ())))
+	    {
+	      int_range<1> tmp (type (), lower_bound (x), upper_bound (x));
+	      invalid.union_ (tmp);
+	      continue;
+	    }
+	  if (lower_bound (x) != lb)
+	    m_base[x * 2] = lb;
+	  if (upper_bound (x) != ub)
+	    m_base[x * 2 + 1] = ub;
+	}
+    }
+  // Remove any subranges which are no invalid.
+  if (!invalid.undefined_p ())
+    {
+      invalid.invert ();
+      intersect (invalid);
+    }
+  return changed;
+}
+
 // If the mask can be trivially converted to a range, do so.
 // Otherwise attempt to remove the lower bits from the range.
 // Return true if the range changed in any way.
@@ -2353,6 +2441,9 @@ irange::set_range_from_bitmask ()
 
   // Make sure we call intersect, so do it first.
   changed = intersect (mask_range) | changed;
+  // Npw make sure each subrange endpoint matches the bitmask.
+  changed |= snap_subranges ();
+
   return changed;
 }
 
diff --git a/gcc/value-range.h b/gcc/value-range.h
index 74cdf29ddcb..c32c5076b63 100644
--- a/gcc/value-range.h
+++ b/gcc/value-range.h
@@ -344,6 +344,8 @@ private:
   bool intersect_bitmask (const irange &r);
   bool union_bitmask (const irange &r);
   bool set_range_from_bitmask ();
+  bool snap_subranges ();
+  bool snap (const wide_int &, const wide_int &, wide_int &, wide_int &);
 
   bool intersect (const wide_int& lb, const wide_int& ub);
   bool union_append (const irange &r);
-- 
2.45.0

[COMMITTED] PR tree-optimization/120661 - Snap subrange boundries to bitmask constraints.

Reply via email to