On 9/18/20 12:38 PM, Aldy Hernandez via Gcc-patches wrote:
As part of the ranger work, we have been trying to clean up and generalize interfaces whenever possible.  This not only helps in reducing the maintenance burden going forward, but provides mechanisms for backwards compatibility between ranger and other providers/users of ranges throughout the compiler like evrp and VRP.

One such interface is the range_query class in vr_values.h, which provides a range query mechanism for use in the simplify_using_ranges module.  With it, simplify_using_ranges can be used with the ranger, or the VRP twins by providing a get_value_range() method.  This has helped us in comparing apples to apples while doing our work, and has also future proofed the interface so that asking for a range can be done within the context in which it appeared.  For example, get_value_range now takes a gimple statement which provides context.  We are no longer tied to asking for a global SSA range, but can ask for the range of an SSA within a statement.  Granted, this functionality is currently only in the ranger, but evrp/vrp could be adapted to pass such context.

The range_query is a good first step, but what we really want is a generic query mechanism that can ask for SSA ranges within an expression, a statement, an edge, or anything else that may come up.  We think that a generic mechanism can be used not only for range producers, but consumers such as the substitute_and_fold_engine (see get_value virtual) and possibly the gimple folder (see valueize).

The attached patchset provides such an interface.  It is meant to be a replacement for range_query that can be used for vr_values, substitute_and_fold, the subsitute_and_fold_engine, as well as the ranger.  The general API is:

class value_query
{
public:
   // Return the singleton expression for NAME at a gimple statement,
   // or NULL if none found.
   virtual tree value_of_expr (tree name, gimple * = NULL) = 0;
   // Return the singleton expression for NAME at an edge, or NULL if
   // none found.
   virtual tree value_on_edge (edge, tree name);
   // Return the singleton expression for the LHS of a gimple
   // statement, assuming an (optional) initial value of NAME.  Returns
   // NULL if none found.
   //
   // Note this method calculates the range the LHS would have *after*
   // the statement has executed.
   virtual tree value_of_stmt (gimple *, tree name = NULL);
};

class range_query : public value_query
{
public:
   range_query ();
   virtual ~range_query ();

   virtual tree value_of_expr (tree name, gimple * = NULL) OVERRIDE;
   virtual tree value_on_edge (edge, tree name) OVERRIDE;
   virtual tree value_of_stmt (gimple *, tree name = NULL) OVERRIDE;

   // These are the range equivalents of the value_* methods.  Instead
   // of returning a singleton, they calculate a range and return it in
   // R.  TRUE is returned on success or FALSE if no range was found.
   virtual bool range_of_expr (irange &r, tree name, gimple * = NULL) = 0;
   virtual bool range_on_edge (irange &r, edge, tree name);
   virtual bool range_of_stmt (irange &r, gimple *, tree name = NULL);

   // DEPRECATED: This method is used from vr-values.  The plan is to
   // rewrite all uses of it to the above API.
   virtual const class value_range_equiv *get_value_range (const_tree,
                               gimple * = NULL);
};

The duality of the API (value_of_* and range_on_*) is because some passes are interested in a singleton value (substitute_and_fold_enginge), while others are interested in ranges (vr_values).  Passes that are only interested in singletons can take a value_query, while passes that are interested in full ranges, can take a range_query.  Of course, for future proofing, we would recommend taking a range_query, since if you provide a default range_of_expr, sensible defaults will be provided for the others in terms of range_of_expr.

Note, that the absolute bare minimum that must be provided is a value_of_expr and a range_of_expr respectively.

One piece of the API which is missing is a method  to return the range of an arbitrary SSA_NAME *after* a statement.  Currently range_of_expr calculates the range of an expression upon entry to the statement, whereas range_of_stmt calculates the range of *only* the LHS of a statement AFTER the statement has executed.

This would allow for complete representation of the ranges/values in something like:

     d_4 = *g_7;

Here the range of g_7 upon entry could be VARYING, but after the dereference we know it must be non-zero.  Well for sane targets anyhow.

Choices would be to:

   1) add a 4th method such as "range_after_stmt", or

  2) merge that functionality with the existing range_of_stmt method to provide "after" functionality for any ssa_name.  Currently the SSA_NAME must be the same as the LHS if specified.  It also does not need to be specified to allow evaluation of statements without a LHS, (if (x < 1) has no LHS, but will return [0,0] [1,1] or [0,1] as there is a boolean "result" to the statement.

Also, we've debated whether to use the two classes (value_query and range_query) separately as above, or perhaps join them into one class. I personally like the separation of powers.  Some users only care for singletons, no sense in polluting their implementations with talk of ranges.  But we're open to suggestions.

The patchset is divided into 3 parts:

1. Generic implementation of class.

Just a few comments on the new APIs in part 1.

Here I would again recommend preventing the copying and assigning
the new range_query class.  Ditto for equiv_allocator unless its
base takes care of that.  For value_query, since it has a pure
virtual function, I would suggest to declare its default ctor
protected (and = default).  That way, if someone tries to define
an object of it, they'll get a nice compilation error as opposed
to a confusing linker error.

The range_query default ctor uses the new epression to allocate
and construct its equiv_allocator object but the range_query dtor
just calls release on the object without invoking the delete
expression on it.  Unless the release() function does that (I'd
hope not but I have seen it) that seems like a leak.

Also, defining virtual functions to take default arguments tends
to be tricky and is generally discouraged.  Usually the default
argument value from the base class is used, unless the overridden
virtual function is called directly (either via Derived::foo()
or via ptr_to_derived->foo()).  To avoid surprises all overridden
virtual functions need to have the same defaults as the base,
which is easy to get wrong without compiler warnings.

One way to get around this is to define a non-virtual public
function with defaults in the base class and have it call
a protected virtual without the defaults.

Finally, unless both a type and function with the same name exist
in the same scope there is no reason to mention the class-id when
referencing a class name.  I.e., this

  value_range_equiv *allocate_value_range_equiv ();
  void free_value_range_equiv (value_range_equiv *);

is the same as this:

  class value_range_equiv *allocate_value_range_equiv ();
  void free_value_range_equiv (class value_range_equiv *);

but the former is shorter and clearer (and in line with existing
practice).

I'm hoping to look at the remaining parts soon (especially part
3) but right now I'm being summoned to dinner.

Martin


2. Conversion of vr_values and substitute_and_fold_engine to class.

The conversion of the substitute_and_fold_engine class involved changing all calls to get_value() to a corresponding method providing context. For example, when the edge or statement is available, we pass these to the appropriate method.

With this, we've been able to plug in the ranger into our hybrid evrp, which can then get better ranges.  Our hybrid evrp has moved a lot of optimizations that were happening much later into the early vrp stage.

3. Conversion of sprintf/strlen pass to class.

This is a nonfunctional change to the sprintf/strlen passes.  That is, no effort was made to change the passes to multi-ranges.  However, with this patch, we are able to plug in a ranger or evrp with just a few lines, since the range query mechanism share a common API.

---------------------

We think this interface is generic enough to be used in any place that queries ranges or singletons, basically any place we have *valueize* in the compiler.  It can also be used to replace range generators at will, or even combine them to get the best of both worlds.  For example, places that currently have a value_* interface because they work with constants, like CCP, could have an alternate valueizer plugged in and if a range folds to a constant, CCP could then propagate that constant.

Tested on x86-64 Linux.

Aldy

Reply via email to