On Wed, Oct 25, 2023 at 5:51 AM Andrew Pinski <pins...@gmail.com> wrote: > > I noticed we were missing optimizing `a / (1 << b)` when > we know that a is nonnegative but only due to ranger information. > This adds the use of the global ranger to tree_single_nonnegative_warnv_p > for SSA_NAME. > I didn't extend tree_single_nonnegative_warnv_p to use the ranger for floating > point nor to use the local ranger since I am not 100% sure it is safe where > all of the uses tree_expr_nonnegative_p would be safe. > > Note pr80776-1.c testcase fails again due to vrp's bad handling of setting > global ranges from __builtin_unreachable. It just happened to be optimized > before due to global ranges not being used as much. > > Bootstrapped and tested on x86_64-linux-gnu with no regressions. > > PR tree-optimization/111959 > > gcc/ChangeLog: > > * fold-const.cc (tree_single_nonnegative_warnv_p): Use > the global range to see if the SSA_NAME was nonnegative. > > gcc/testsuite/ChangeLog: > > * gcc.dg/tree-ssa/forwprop-42.c: New test. > * gcc.dg/pr80776-1.c: xfail and update comment. > --- > gcc/fold-const.cc | 36 +++++++++++++++------ > gcc/testsuite/gcc.dg/pr80776-1.c | 8 ++--- > gcc/testsuite/gcc.dg/tree-ssa/forwprop-42.c | 15 +++++++++ > 3 files changed, 46 insertions(+), 13 deletions(-) > create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/forwprop-42.c > > diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc > index 40767736389..2a2a90230f5 100644 > --- a/gcc/fold-const.cc > +++ b/gcc/fold-const.cc > @@ -15047,15 +15047,33 @@ tree_single_nonnegative_warnv_p (tree t, bool > *strict_overflow_p, int depth) > return RECURSE (TREE_OPERAND (t, 1)) && RECURSE (TREE_OPERAND (t, 2)); > > case SSA_NAME: > - /* Limit the depth of recursion to avoid quadratic behavior. > - This is expected to catch almost all occurrences in practice. > - If this code misses important cases that unbounded recursion > - would not, passes that need this information could be revised > - to provide it through dataflow propagation. */ > - return (!name_registered_for_update_p (t) > - && depth < param_max_ssa_name_query_depth > - && gimple_stmt_nonnegative_warnv_p (SSA_NAME_DEF_STMT (t), > - strict_overflow_p, depth)); > + { > + /* For integral types, querry the global range if possible. */
query > + if (INTEGRAL_TYPE_P (TREE_TYPE (t))) > + { > + value_range vr; > + if (get_global_range_query ()->range_of_expr (vr, t) > + && !vr.varying_p () && !vr.undefined_p ()) > + { > + /* If the range is nonnegative, return true. */ > + if (vr.nonnegative_p ()) > + return true; > + > + /* If the range is non-positive, then return false. */ > + if (vr.nonpositive_p ()) > + return false; That's testing for <= 0, nonnegative for >= 0. This means when vr.nonpositive_p () the value could still be zero (and nonnegative), possibly be figured out by the recursion below. Since we don't have negative_p () do we want to test nonpositive_p () && nonzero_p () instead? OK with that change. Richard. > + } > + } > + /* Limit the depth of recursion to avoid quadratic behavior. > + This is expected to catch almost all occurrences in practice. > + If this code misses important cases that unbounded recursion > + would not, passes that need this information could be revised > + to provide it through dataflow propagation. */ > + return (!name_registered_for_update_p (t) > + && depth < param_max_ssa_name_query_depth > + && gimple_stmt_nonnegative_warnv_p (SSA_NAME_DEF_STMT (t), > + strict_overflow_p, > depth)); > + } > > default: > return tree_simple_nonnegative_warnv_p (TREE_CODE (t), TREE_TYPE (t)); > diff --git a/gcc/testsuite/gcc.dg/pr80776-1.c > b/gcc/testsuite/gcc.dg/pr80776-1.c > index b9bce62d982..f3d47aeda36 100644 > --- a/gcc/testsuite/gcc.dg/pr80776-1.c > +++ b/gcc/testsuite/gcc.dg/pr80776-1.c > @@ -18,14 +18,14 @@ Foo (void) > if (! (0 <= i && i <= 999999)) > __builtin_unreachable (); > > - /* Legacy evrp sets the range of i to [0, MAX] *before* the first > conditional, > + /* vrp1 sets the range of i to [0, MAX] *before* the first conditional, > and to [0,999999] *before* the second conditional. This is because both > - evrp and VRP use trickery to set global ranges when this particular use > of > + vrp use trickery to set global ranges when this particular use of > a __builtin_unreachable is in play (see uses of > assert_unreachable_fallthru_edge_p). > > - Setting these ranges at the definition site, causes VRP to remove the > + Setting these ranges at the definition site, causes other passes to > remove the > unreachable code altogether, leaving the following sprintf unguarded. > This > causes the bogus warning below. */ > - sprintf (number, "%d", i); /* { dg-bogus "writing" "" } */ > + sprintf (number, "%d", i); /* { dg-bogus "writing" "" { xfail *-*-* } } */ > } > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/forwprop-42.c > b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-42.c > new file mode 100644 > index 00000000000..4e5421ed4d4 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-42.c > @@ -0,0 +1,15 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fdump-tree-optimized" } */ > +/* PR tree-optimization/111959 */ > + > +int divbypow2(int a, int b) > +{ > + if (a & ~0xff) __builtin_unreachable(); > + return a / (1<<b); > +} > + > +/* divbypow2 should be able to optimize to just a/b as a is known to be > always positive. */ > +/* { dg-final { scan-tree-dump-not " / " "optimized" } } */ > +/* { dg-final { scan-tree-dump-not " << " "optimized" } } */ > +/* { dg-final { scan-tree-dump-times " >> " 1 "optimized" } } */ > + > -- > 2.39.3 >