On 6/24/19 9:24 AM, Richard Biener wrote:
On Fri, Jun 21, 2019 at 1:41 PM Aldy Hernandez <al...@redhat.com> wrote:

Hi Richard.  Hi folks.

In order to unify the APIs for value_range and irange, we'd like to make
some minor changes to value_range.  We believe most of these changes
could go in now, and would prefer so, to get broader testing and
minimize the plethora of changes we drag around on our branch.

First, introduce a type for VR_VARYING and VR_UNDEFINED.
--------------------------------------------------------

irange utilizes 0 or more sub-ranges to represent a range, and VARYING
is simply one subrange [MIN, MAX].    value_range represents this with
VR_VARYING, and since there is no type associated with it, we cannot
calculate the lower and upper bounds for the range.  There is also a
lack of canonicalness in value range in that VR_VARYING and [MIN, MAX]
are two different representations of the same value.

We tried to adjust irange to not associate a type with the empty range
[] (representing undefined), but found we were unable to perform all
operations properly.  In particular, we cannot invert an empty range.
i.e. invert ( [] ) should produce [MIN, MAX].  Again, we need to have a
type associated with this empty range.

We'd like to tweak value_range so that set_varying() and set_undefined()
both take a type, and then always set the min/max fields based on that
type.  This takes no additional memory in the structure, and is
virtually transparent to all the existing uses of value_range.

This allows:
    1)  invert to be implemented properly for both VARYING and UNDEFINED
by simply changing one to the other.
    2)  the type() method to always work without any special casing by
simply returning TREE_TYPE(min)
    3)  the new incoming bounds() routines to work trivially for these
cases as well (lbound/ubound, num_pairs(), etc).

This functionality is provided in the first attached patch.

Note, the current implementation sets min/max to TREE_TYPE, not to
TYPE_MIN/MAX_VALUE.  We can fix this if preferred.

How does this work with

value_range *
vr_values::get_value_range (const_tree var)
{
   static const value_range vr_const_varying (VR_VARYING, NULL, NULL);
...
   /* If we query the range for a new SSA name return an unmodifiable VARYING.
      We should get here at most from the substitute-and-fold stage which
      will never try to change values.  */
   if (ver >= num_vr_values)
     return CONST_CAST (value_range *, &vr_const_varying);

?

Good question. This glaring omission came about after a full round of tests on our branch immediately after posting :).

I am currently just allocating a new one each time:

   if (ver >= num_vr_values)
-    return CONST_CAST (value_range *, &vr_const_varying);
+    {
+      /* ?? At some point we should find a way to cache varying ranges
+        by type.  In the tree type itself?  */
+      vr = vrp_value_range_pool.allocate ();
+      vr->set_varying (type);
+      return vr;
+    }

but we should discuss alternatives. Ideally, we had batted around the idea of keeping the range for varying, cached in the type itself, because of its prevalence. I think Andrew mentioned it would increase the size of type nodes by 4%. Are there that many types that this would incur a significant penalty? Another alternative would be a cache on the side. What are your thoughts?


Second, enforce canonicalization at value_range build time.
-----------------------------------------------------------

As discussed above, value_range has multiple representations for the
same range.  For instance, ~[0,0] is the same as [1,MAX] in unsigned and
[MIN, MAX] is really varying, among others.  We found it quite difficult
to make things work, with multiple representations for a given range.
Canonicalizing at build time solves this, as well as removing explicit
set_and_canonicalize() calls throughout.  Furthermore, it avoids some
special casing in VRP.

Along with canonicalizing, we also enforce the existing value_range API
more strongly.  For instance, we don't allow setting equivalences for
either VR_VARYING or VR_UNDEFINED.

This functionality is provided in the second patch.

Fair enough.  Didn't look at the patch yet, sending separate mails would have
been prefered - or are the patches not independent of each other?  Note

Well, the varying/undefined patch goes first, but the patches are conceptually independent of each other. I posted all so you could see how everything fit together.

canonicalization performs quite some work so a shortcut
set () with just checking the input is already canonicalized would be nice?

I wonder you still have anti-ranges since you can handle > 1 subranges
in ranger?

The ranger uses the more abstract API of looking at upper_bound(), lower_bound() and num_pairs(). It has not knowledge of anti-ranges, or the internal representation. So you can have your cake and eat it too :). value_range can have its anti ranges, and the ranger can work with either or, while looking at things from a higher level.


Third, irange on value_range implementation.
---------------------------------------------

The third attached patch shows how we use the above two to implement
irange using value_ranges.  value_range would be a drop-in replacement
for irange, by just doing the following in range.h:

+// Enable this to implement irange piggybacking on value_range.
+#define IRANGE_WITH_VALUE_RANGE 1
+
+#if IRANGE_WITH_VALUE_RANGE
+#include "tree-vrp.h"
+typedef value_range_base irange;
+typedef value_range_storage irange_storage;
+#define IRANGE_PLAIN VR_RANGE
+#define IRANGE_INVERSE VR_ANTI_RANGE
+#else
...

The additions to the value_range API would be mostly the following (full
details in the third attached patch):

+  value_range_base (tree, tree);
+  value_range_base (value_range_kind,
+                   tree type, const wide_int &, const wide_int &);
+  value_range_base (tree type, const wide_int &, const wide_int &);
+  value_range_base (tree type, const value_range_storage *);
+  value_range_base (tree type);

     void set (value_range_kind, tree, tree);
     void set (tree);
@@ -77,7 +86,25 @@ public:
     bool singleton_p (tree *result = NULL) const;
     void dump (FILE *) const;

+  /* Support machinery for irange.  */
+  static const unsigned int m_max_pairs = 2;
+  static bool supports_type_p (tree type);
+  static bool supports_ssa_p (tree ssa);
+  static bool supports_p (tree expr);
+  void cast (tree);
+  bool contains_p (tree) const;
+  unsigned num_pairs () const;
+  wide_int lower_bound (unsigned = 0) const;
+  wide_int upper_bound (unsigned) const;
+  wide_int upper_bound () const;
+  void invert ();
+  void dump () const { dump (stderr); }
+  // FIXME: Perhaps rewrite the irange versions to use pointers instead.
+  void union_ (const value_range_base &);
+  void intersect (const value_range_base &);
+
   protected:
+  value_range_base normalize_symbolics () const;

There are no regressions from mainline, and we are catching every one of
our internal ranger tests, with the exception of two that require more
than 2 sub-ranges to work.  i.e. Not a regression from mainline-- just
new functionality we can't get with the limited sub-ranges in value_range.

Note: Please ignore the value_range_base::normalize_symbolics stuff.
It's likely to go through multiple revisions as Andrew gets the ranger
to deal with symbolics.

Finally
-------

All these patches are already in our branch, and irange with value_range
can be enabled by #define IRANGE_WITH_VALUE_RANGE 1.

With these changes, we can successfully build and run the ranger branch
using value_range in place of irange, which indicates a successful API
merge.

After some minor cleanups, I would like to contribute at least the first
two patches to trunk (VARYING types and the enforced canonicalization).
This will enable us to move forward with trying to integrate the
range-ops code which makes heavy use of the subrange interface, and
allows for broader testing instead of dropping everything in one-go.
These two patches stand on their own independently, and IMO provide
useful functionality right now.

Works for me.  Please send any such patches separately (after cleanup)

Ok, I am shuffling even more things in our branch in preparation for future work. When I'm done with that and can verify that things work with value_range, irange, VRP, and the ranger, I will start posting pieces independently. I just wanted to make sure we all agreed on the general approach.


I would ideally like to contribute the third patch to mainline, but I do
understand that it currently has no users... although I could rewrite
bits of tree-vrp to use these new functions (lower_bound, upper_bound,
etc), thus providing a use case ??.  However, I do understand if you'd
prefer to keep this last patch on the branch.

I'd prefer to keep the last one on the branch indeed.

Alrighty.

Thanks.
Aldy


Richard.

Thoughts?

Aldy (and Andrew)

Reply via email to