On Mon, Nov 3, 2025 at 11:34 AM Martin Jambor <[email protected]> wrote:
>
> Hi,
>
> On Sun, Nov 02 2025, Richard Biener wrote:
> > On Fri, Oct 31, 2025 at 10:47 PM Martin Jambor <[email protected]> wrote:
> >>
> >> Hi,
> >>
> >> this patch adds the ability to infer ranges from loads from global
> >> constant static aggregates which have static initializers. Even when
> >> the load has an ARRAY_REF with an unknown index and thus we do not know
> >> the particular constant that is being loaded, we can traverse the
> >> corresponding elements of the initializer and see if we know in what
> >> range(s) the loaded value must fall - or for pointers we can sometimes
> >> infer that the value cannot be NULL.
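> >>
> >> For illustration, a hypothetical sketch (not one of the new testcases)
> >> of the kind of load this targets:
> >>
> >>   static const int tab[4] = { 3, 7, 11, 19 };
> >>
> >>   int
> >>   f (unsigned i)
> >>   {
> >>     return tab[i & 3];  /* The loaded value is always 3, 7, 11 or 19.  */
> >>   }
> >>
> >>   static const char *const names[2] = { "lo", "hi" };
> >>   /* Similarly, a load from names[i & 1] can be inferred to be non-NULL.  */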
> >>
> >> I thought this was similar to fold_using_range::range_of_address and
> >> so I decided to put my implementation alongside it.  I hope that is
> >> the correct place.
> >>
> >> Bootstrapped, LTO-bootstrapped and tested on x86_64-linux; a bootstrap
> >> on aarch64-linux is underway.  OK for master if it passes?
> >
> > Comments below.
> >
> >> Thanks,
> >>
> >> Martin
> >>
> >>
> >> gcc/ChangeLog:
> >>
> >> 2025-10-31 Martin Jambor <[email protected]>
> >>
> >> * gimple-range-fold.h (class fold_using_range): New member
> >> function range_from_readonly_var.
> >> * gimple-range-fold.cc (fold_using_range::fold_stmt): Call
> >> range_from_readonly_var on assignments.
> >> (add_loaded_invariant_to_range): New function.
> >> (range_from_readonly_load): Likewise.
> >> (fold_using_range::range_from_readonly_var): Likewise.
> >> * params.opt (param_vrp_cstload_limit): New.
> >> * doc/invoke.texi (vrp-cstload-limit): Likewise.
> >>
> >> gcc/testsuite/ChangeLog:
> >>
> >> 2025-10-31 Martin Jambor <[email protected]>
> >>
> >> * gcc.dg/tree-ssa/vrp-from-cst-agg-1.c: New test.
> >> * gcc.dg/tree-ssa/vrp-from-cst-agg-2.c: Likewise.
> >> * gcc.dg/tree-ssa/vrp-from-cst-agg-3.c: Likewise.
> >> * gcc.dg/tree-ssa/vrp-from-cst-agg-4.c: Likewise.
> >> * gcc.dg/ipa/vrp-from-cst-agg-5.c: Likewise.
> >> ---
> [...]
> >> diff --git a/gcc/gimple-range-fold.cc b/gcc/gimple-range-fold.cc
> >> index 06c645f3d08..b455589428e 100644
> >> --- a/gcc/gimple-range-fold.cc
> >> +++ b/gcc/gimple-range-fold.cc
> >> @@ -639,10 +639,14 @@ fold_using_range::fold_stmt (vrange &r, gimple *s, fur_source &src, tree name)
> >>    if (!name)
> >>      name = gimple_get_lhs (s);
> >>
> >> -  // Process addresses.
> >> -  if (gimple_code (s) == GIMPLE_ASSIGN
> >> -      && gimple_assign_rhs_code (s) == ADDR_EXPR)
> >> -    return range_of_address (as_a <prange> (r), s, src);
> >> +  // Process addresses and loads from static constructors.
> >> +  if (gimple_code (s) == GIMPLE_ASSIGN)
> >> +    {
> >> +      if (gimple_assign_rhs_code (s) == ADDR_EXPR)
> >> +        return range_of_address (as_a <prange> (r), s, src);
> >> +      if (range_from_readonly_var (r, s))
> >> +        return true;
> >> +    }
> >>
> >>    gimple_range_op_handler handler (s);
> >>    if (handler)
> >> @@ -891,6 +895,208 @@ fold_using_range::range_of_address (prange &r, gimple *stmt, fur_source &src)
> >>    return true;
> >>  }
> >>
> >> +// VAL must be an interprocedural invariant. If VAL is a pointer which cannot
> >> +// be zero, then set it to nonzero if it is undefined and return true. If VAL
> >> +// is a pointer which can be zero return false. If VAL is of another type, add
> >> +// the constant to the set of ranges to R and return true.
> >> +
> >> +static bool
> >> +add_loaded_invariant_to_range (vrange &r, tree val)
> >> +{
> >> +  if (POINTER_TYPE_P (TREE_TYPE (val)))
> >> +    {
> >> +      bool strict_overflow_p;
> >> +      if (tree_single_nonzero_warnv_p (val, &strict_overflow_p))
> >> +        {
> >> +          if (r.undefined_p ())
> >> +            r.set_nonzero (TREE_TYPE (val));
> >> +          return true;
> >> +        }
> >> +      else
> >> +        return false;
> >> +    }
> >> +  else
> >> +    {
> >> +      if (TREE_CODE (val) == REAL_CST
> >> +          && real_isnan (TREE_REAL_CST_PTR (val)))
> >> +        return false;
> >> +
> >> +      if (is_a <irange> (r))
> >> +        {
> >> +          irange &ir = as_a <irange> (r);
> >> +          if (!r.contains_p (val)
> >> +              && ir.num_pairs () >= (unsigned) param_vrp_cstload_limit)
> >> +            return false;
> >> +        }
> >> +
> >> +      value_range tmp (val, val, VR_RANGE);
> >> +      if (r.undefined_p ())
> >> +        r = tmp;
> >> +      else
> >> +        r.union_ (tmp);
> >> +      return true;
> >> +    }
> >> +}
> >> +
> >> +// One step of fold_using_range::range_from_readonly_var. Process expressions
> >> +// in comps from index I to 0 according to the corresponding static initializer
> >> +// in CONSTRUCTOR, adding all possible loaded constants (which must be storable
> >> +// to TYPE) to R. Return true if all the necessary values were extracted and
> >> +// added to R successfully.
> >> +
> >> +static bool
> >> +range_from_readonly_load (vrange &r, tree type, tree constructor,
> >> +                          const vec <tree> &comps, int i)
> >> +{
> >> +  unsigned ix;
> >> +  tree index, val;
> >> +  tree expr = comps[i];
> >> +
> >> +  gcc_assert (TREE_CODE (expr) == COMPONENT_REF
> >> +              || TREE_CODE (expr) == ARRAY_REF);
> >> +  tree op1 = TREE_OPERAND (expr, 1);
> >
> > Please use get_array_ctor_element_at_index for constant indexing into
> > an ARRAY_REF.
>
> Will do, thanks for the pointer.
>
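For the constant-index case I had in mind something along these lines (an
untested sketch only; op1 and constructor as in your range_from_readonly_load,
and ignoring a possible non-zero array lower bound):

  if (TREE_CODE (op1) == INTEGER_CST)
    {
      tree elt
        = get_array_ctor_element_at_index (constructor, wi::to_offset (op1));
      if (!elt)
        return false;
      /* ...and only this element needs to be considered further.  */
    }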
> > I think you should be able to use fold_const_aggregate_ref
> > to get the CTOR for the "inner" part when you pass it the parent of the
> > single variable ref.
>
> The patch can handle more than one variable ARRAY_REF.  And I think it
> should be able to handle accesses to small MxN arrays with both indices
> unknown too.  My commit message might not have been clear about this; I
> will adjust it and add a testcase for a 2x2 array.
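> Something like this (hypothetical, not the actual testcase):
>
>   static const int m[2][2] = { { 1, 2 }, { 3, 4 } };
>
>   int
>   f (unsigned i, unsigned j)
>   {
>     return m[i & 1][j & 1];  /* Always one of 1, 2, 3 or 4.  */
>   }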
I see.
> > Then you can use the same function to fold the outer (also constant)
> > refs by tacking the resulting CTOR as the base of the outer ref with
> > the interesting constant index.
> >
> > IMO the recursion doesn't help readability.
>
> In order to support more than one ARRAY_REF it is necessary, I'm afraid
> (or we would have to emulate it with a stack, of course).
OK, but you already collect a vector of "indexes" to walk "backwards", so
the recursion felt a bit odd to me; with multiple variable indices that's
explainable, though.
> >
> > I think the param, instead of limiting the range complexity, needs to
> > limit the amount of CTOR walking done.  Otherwise we'd scan a
> > multi-gigabyte two-valued CTOR like {0, 1, 0, 1 ..} with arr[i] many
> > many times.
>
> Right.  Do we want both?  While it probably makes sense to traverse a
> 5x5 array containing ones and zeros, I am not sure whether any harm can
> be done by building ranges with 25 pairs if the values are unique.  But
> perhaps not.
I think we want to have a quite low limit on the walk-the-CTOR complexity.
For value-range there's currently a HARD_MAX_RANGES of 255; it will
simply degrade range precision after hitting that.
> > Having a CTOR pre-classified by IPA would be nice.
>
> I guess we could cache the results and keep them for all functions, but
> this is probably something for later.
>
> >> +  FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (constructor), ix, index, val)
> >> +    {
> >> +      /* TODO: If the array index in the expr is an SSA_NAME with a known
> >> +         range, we could use just values loaded from the corresponding array
> >> +         elements. */
> >
> > Until you do this, this really boils down to the complete static CTOR
> > range, which should be easier to cache than the VR, CTOR pair you'd have
> > to use otherwise.
>
> I think caching the queries for particular known fields of COMPONENT_REFs
> and/or particular known indices of ARRAY_REFs would not be much more
> difficult.
True.  I was thinking more of handling only the case of a uniformly typed
(as in leaf elements) variable with a CTOR (like arrays), where we could
cache the overall value-range [approximation] for the whole CTOR.
> In any event, thanks for looking at this,
>
> Martin