On Mon, Dec 4, 2017 at 6:59 AM, Jeff Law <l...@redhat.com> wrote: > And finally, the meat of the fix for pr78496. This is largely the patch > that was posed right when stage1 closed with just minor updates. > > This patch embeds evrp analysis into DOM and exploits it in the minimal > way necessary to fix pr78496 without regressing other tests in the > testsuite. > > The key effect we're looking for here is to pick up a slew of jump > threads in DOM2 by exploiting the range information during jump threading. > > These aren't handled well in the standard tree-vrp jump threading -- we > could abuse ASSERT_EXPRs further, but it'd just be ugly and probably > computationally expensive. > > The ranges we want fall out naturally from a dominator walk, hence all > the recent work around pulling out EVRP analysis into a little module > other passes could reuse. > > With these changes we pick up all the missing jump thread opportunities > in DOM for pr78496. Sadly, I've been able to pull together an automated > test that I like. About the best I could come up with would be to > verify that a large number of jump threads were realized during DOM2. > But that still seems rather fragile. I also looked at examining the > results of PRE, but the partial redundancies that were originally the > source of complaints in that BZ have already been fixed. I also looked > at perhaps looking for PHIs with constant args and then perhaps trying > to filter those, but got nowhere. > > So there's no direct test for pr78496. Sigh. > > > > Running EVRP analysis in DOM had an unintended side effect. Namely that > SSA_NAMEs that got created after the EVRP optimization pass would have > range information attached to them. That caused two warning regressions > with -O1 code in the C++ and Fortran testsuites. The problem is the > ranges attached to the new SSA_NAMES were very wide and there was code > left on an unexecutable path that called an allocation routine (C++) and > a memset (Fortran) using those names as arguments. The uber-wide ranges > in those circumstances triggered the false positive warnings. > > By exploiting the EVRP data during the standard part of DOM to optimize > conditionals, we're able to prove the paths in question simply aren't > executable. We remove them and the warnings go away. > > This work compromises builtin-unreachable-6. So the original test it > kept and -fno-tree-dominator-opts is added to keep it from being > compromised. A new builtin-unreachable-6a test is created to very that > DOM does indeed remove all the __builtin_unreachable paths. > > This work also results in us optimizing 20030922-2.c again (conditional > equivalences). EVRP analysis will note the conditional equivalence. We > don't propagate anything from EVRP analysis at this time, but we do use > it to try and simplify GIMPLE_CONDs to a compile-time constant which is > what happens in this case. > > I plan to check this in if/when the first two patches are approved.
Looks good to me. Btw, did you look at adding a GIMPLE testcase for 78496? You'd have stable IL into DOM then... Richard. > Jeff > > ps. The x_vr_values hack is scheduled to go away as we remove > tree-vrp.c's threading implementation in gcc-9 -- we'll be able to drop > the simplification callback and instead have simplification live within > the dom_opt_dom_walker class where it'll have access to vr_values. The > analogous version in tree-vrp.c just gets deleted at that time. > > pps. I know there's a bogus propagation patch that I need to go back and > re-check based on comments from Richi. That's definitely in the queue. > > > > * gimple-ssa-evrp-analyze.h > (evrp_range_analyzer::get_vr_values): Simplify. > * gimple-ssa-evrp-analyze.c: Corresponding changes. > > * tree-ssa-dom.c: Include alloc-pool.h, tree-vrp.h, vr-values.h > and gimple-ssa-evrp-analyze.h. > (dom_opt_dom_walker class): Add evrp_range_analyzer member. > (simplify_stmt_for_jump_threading): Copy a blob of code from > tree-vrp.c to use ranges to simplify statements. > (dom_opt_dom_walker::before_dom_children): Call > evrp_range_analyzer::{enter,record_ranges_from_stmt} methods. > (dom_opt_dom_walker::after_dom_children): Similarly for > evrp_range_analyzer::leave. > (dom_opt_dom_walker::optimize_stmt): Use EVRP ranges to optimize > conditionals. > > * gcc.dg/builtin-unreachable-6.c: Disable DOM. > * gcc.dg/builtin-unreachable-6a.c: New test. > * gcc.dg/tree-ssa/20030922-1.c: No longer XFAIL. > * gcc.dg/ssa-dom-branch-1.c: Tweak expected output. > > diff --git a/gcc/gimple-ssa-evrp-analyze.h b/gcc/gimple-ssa-evrp-analyze.h > index 6216a90..4783e6f 100644 > --- a/gcc/gimple-ssa-evrp-analyze.h > +++ b/gcc/gimple-ssa-evrp-analyze.h > @@ -51,13 +51,7 @@ class evrp_range_analyzer > true, then we are transferring ownership. Else we keep ownership. > > This should be converted to a unique_ptr. */ > - class vr_values *get_vr_values (bool transfer) > - { > - class vr_values *x = vr_values; > - if (transfer) > - vr_values = NULL; > - return x; > - } > + class vr_values *get_vr_values (void) { return vr_values; } > > private: > DISABLE_COPY_AND_ASSIGN (evrp_range_analyzer); > diff --git a/gcc/gimple-ssa-evrp.c b/gcc/gimple-ssa-evrp.c > index a554cf9..609ce38 100644 > --- a/gcc/gimple-ssa-evrp.c > +++ b/gcc/gimple-ssa-evrp.c > @@ -68,7 +68,7 @@ class evrp_dom_walker : public dom_walker > public: > evrp_dom_walker () > : dom_walker (CDI_DOMINATORS), > - evrp_folder (evrp_range_analyzer.get_vr_values (false)) > + evrp_folder (evrp_range_analyzer.get_vr_values ()) > { > need_eh_cleanup = BITMAP_ALLOC (NULL); > } > diff --git a/gcc/testsuite/gcc.dg/builtin-unreachable-6.c > b/gcc/testsuite/gcc.dg/builtin-unreachable-6.c > index 2f8ca36..1915dd1 100644 > --- a/gcc/testsuite/gcc.dg/builtin-unreachable-6.c > +++ b/gcc/testsuite/gcc.dg/builtin-unreachable-6.c > @@ -1,5 +1,5 @@ > /* { dg-do compile } */ > -/* { dg-options "-O2 -fdump-tree-fab1" } */ > +/* { dg-options "-O2 -fdump-tree-fab1 -fno-tree-dominator-opts" } */ > > void > foo (int b, int c) > diff --git a/gcc/testsuite/gcc.dg/builtin-unreachable-6a.c > b/gcc/testsuite/gcc.dg/builtin-unreachable-6a.c > new file mode 100644 > index 0000000..f527f2e > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/builtin-unreachable-6a.c > @@ -0,0 +1,7 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fdump-tree-fab1" } */ > + > +#include "builtin-unreachable-6.c" > + > +/* { dg-final { scan-tree-dump-times "lab:" 1 "fab1" } } */ > +/* { dg-final { scan-tree-dump-not "__builtin_unreachable" "fab1" } } */ > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c > b/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c > index 172f203..16c79da 100644 > --- a/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c > +++ b/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c > @@ -20,6 +20,4 @@ rgn_rank (rtx insn1, rtx insn2) > } > > /* There should be two IF conditionals. */ > -/* We no longer record the conditional equivalence by design, thus we > - are unable to simplify the 3rd conditional at compile time. */ > -/* { dg-final { scan-tree-dump-times "if " 2 "dom2" { xfail *-*-* } } } */ > +/* { dg-final { scan-tree-dump-times "if " 2 "dom2" } } */ > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-branch-1.c > b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-branch-1.c > index 3c15296..fae5bde 100644 > --- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-branch-1.c > +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-branch-1.c > @@ -19,9 +19,10 @@ try_combine (rtx i1, rtx newpat) > else if (i1 && foo ()); > } > > -/* There should be two tests against i1. One from the hash table > - dumps, one in the code itself. */ > -/* { dg-final { scan-tree-dump-times "if .i1_" 2 "dom2"} } */ > +/* There should be four tests against i1. One from the hash table > + dumps, one from the EVRP analyzer one from EVRP evaluation and one > + in the code itself. */ > +/* { dg-final { scan-tree-dump-times "if .i1_" 4 "dom2"} } */ > > /* There should be no actual jump threads realized by DOM. The > legitimize jump threads are handled in VRP and those discovered > diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c > index 916d661..59a7d87 100644 > --- a/gcc/tree-ssa-dom.c > +++ b/gcc/tree-ssa-dom.c > @@ -46,6 +46,10 @@ along with GCC; see the file COPYING3. If not see > #include "gimplify.h" > #include "tree-cfgcleanup.h" > #include "dbgcnt.h" > +#include "alloc-pool.h" > +#include "tree-vrp.h" > +#include "vr-values.h" > +#include "gimple-ssa-evrp-analyze.h" > > /* This file implements optimizations on the dominator tree. */ > > @@ -584,6 +588,9 @@ private: > class const_and_copies *m_const_and_copies; > class avail_exprs_stack *m_avail_exprs_stack; > > + /* VRP data. */ > + class evrp_range_analyzer evrp_range_analyzer; > + > /* Dummy condition to avoid creating lots of throw away statements. */ > gcond *m_dummy_cond; > > @@ -835,6 +842,9 @@ make_pass_dominator (gcc::context *ctxt) > return new pass_dominator (ctxt); > } > > +/* A hack until we remove threading from tree-vrp.c and bring the > + simplification routine into the dom_opt_dom_walker class. */ > +static class vr_values *x_vr_values; > > /* A trivial wrapper so that we can present the generic jump > threading code with a simple API for simplifying statements. */ > @@ -844,7 +854,95 @@ simplify_stmt_for_jump_threading (gimple *stmt, > class avail_exprs_stack *avail_exprs_stack, > basic_block bb ATTRIBUTE_UNUSED) > { > - return avail_exprs_stack->lookup_avail_expr (stmt, false, true); > + /* First query our hash table to see if the the expression is available > + there. A non-NULL return value will be either a constant or another > + SSA_NAME. */ > + tree cached_lhs = avail_exprs_stack->lookup_avail_expr (stmt, false, > true); > + if (cached_lhs) > + return cached_lhs; > + > + /* If the hash table query failed, query VRP information. This is > + essentially the same as tree-vrp's simplification routine. The > + copy in tree-vrp is scheduled for removal in gcc-9. */ > + if (gcond *cond_stmt = dyn_cast <gcond *> (stmt)) > + { > + cached_lhs > + = x_vr_values->vrp_evaluate_conditional (gimple_cond_code (cond_stmt), > + gimple_cond_lhs (cond_stmt), > + gimple_cond_rhs (cond_stmt), > + within_stmt); > + return cached_lhs; > + } > + > + if (gswitch *switch_stmt = dyn_cast <gswitch *> (stmt)) > + { > + tree op = gimple_switch_index (switch_stmt); > + if (TREE_CODE (op) != SSA_NAME) > + return NULL_TREE; > + > + value_range *vr = x_vr_values->get_value_range (op); > + if ((vr->type != VR_RANGE && vr->type != VR_ANTI_RANGE) > + || symbolic_range_p (vr)) > + return NULL_TREE; > + > + if (vr->type == VR_RANGE) > + { > + size_t i, j; > + > + find_case_label_range (switch_stmt, vr->min, vr->max, &i, &j); > + > + if (i == j) > + { > + tree label = gimple_switch_label (switch_stmt, i); > + > + if (CASE_HIGH (label) != NULL_TREE > + ? (tree_int_cst_compare (CASE_LOW (label), vr->min) <= 0 > + && tree_int_cst_compare (CASE_HIGH (label), vr->max) >= > 0) > + : (tree_int_cst_equal (CASE_LOW (label), vr->min) > + && tree_int_cst_equal (vr->min, vr->max))) > + return label; > + > + if (i > j) > + return gimple_switch_label (switch_stmt, 0); > + } > + } > + > + if (vr->type == VR_ANTI_RANGE) > + { > + unsigned n = gimple_switch_num_labels (switch_stmt); > + tree min_label = gimple_switch_label (switch_stmt, 1); > + tree max_label = gimple_switch_label (switch_stmt, n - 1); > + > + /* The default label will be taken only if the anti-range of the > + operand is entirely outside the bounds of all the > (non-default) > + case labels. */ > + if (tree_int_cst_compare (vr->min, CASE_LOW (min_label)) <= 0 > + && (CASE_HIGH (max_label) != NULL_TREE > + ? tree_int_cst_compare (vr->max, CASE_HIGH (max_label)) > >= 0 > + : tree_int_cst_compare (vr->max, CASE_LOW (max_label)) > >= 0)) > + return gimple_switch_label (switch_stmt, 0); > + } > + return NULL_TREE; > + } > + > + if (gassign *assign_stmt = dyn_cast <gassign *> (stmt)) > + { > + tree lhs = gimple_assign_lhs (assign_stmt); > + if (TREE_CODE (lhs) == SSA_NAME > + && (INTEGRAL_TYPE_P (TREE_TYPE (lhs)) > + || POINTER_TYPE_P (TREE_TYPE (lhs))) > + && stmt_interesting_for_vrp (stmt)) > + { > + edge dummy_e; > + tree dummy_tree; > + value_range new_vr = VR_INITIALIZER; > + x_vr_values->extract_range_from_stmt (stmt, &dummy_e, > + &dummy_tree, &new_vr); > + if (range_int_cst_singleton_p (&new_vr)) > + return new_vr.min; > + } > + } > + return NULL; > } > > /* Valueize hook for gimple_fold_stmt_to_constant_1. */ > @@ -1310,6 +1408,8 @@ dom_opt_dom_walker::before_dom_children (basic_block bb) > if (dump_file && (dump_flags & TDF_DETAILS)) > fprintf (dump_file, "\n\nOptimizing block #%d\n\n", bb->index); > > + evrp_range_analyzer.enter (bb); > + > /* Push a marker on the stacks of local information so that we know how > far to unwind when we finalize this block. */ > m_avail_exprs_stack->push_marker (); > @@ -1332,7 +1432,10 @@ dom_opt_dom_walker::before_dom_children (basic_block > bb) > > edge taken_edge = NULL; > for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) > - taken_edge = this->optimize_stmt (bb, gsi); > + { > + evrp_range_analyzer.record_ranges_from_stmt (gsi_stmt (gsi)); > + taken_edge = this->optimize_stmt (bb, gsi); > + } > > /* Now prepare to process dominated blocks. */ > record_edge_info (bb); > @@ -1350,13 +1453,16 @@ dom_opt_dom_walker::before_dom_children (basic_block > bb) > void > dom_opt_dom_walker::after_dom_children (basic_block bb) > { > + x_vr_values = evrp_range_analyzer.get_vr_values (); > thread_outgoing_edges (bb, m_dummy_cond, m_const_and_copies, > m_avail_exprs_stack, > simplify_stmt_for_jump_threading); > + x_vr_values = NULL; > > /* These remove expressions local to BB from the tables. */ > m_avail_exprs_stack->pop_to_marker (); > m_const_and_copies->pop_to_marker (); > + evrp_range_analyzer.leave (bb); > } > > /* Search for redundant computations in STMT. If any are found, then > @@ -1902,6 +2008,31 @@ dom_opt_dom_walker::optimize_stmt (basic_block bb, > gimple_stmt_iterator si) > integer_zero_node)); > gimple_set_modified (stmt, true); > } > + else if (TREE_CODE (lhs) == SSA_NAME) > + { > + /* Exploiting EVRP data is not yet fully integrated into DOM > + but we need to do something for this case to avoid regressing > + udr4.f90 and new1.C which have unexecutable blocks with > + undefined behavior that get diagnosed if they're left in the > + IL because we've attached range information to new > + SSA_NAMES. */ > + edge taken_edge = NULL; > + evrp_range_analyzer.vrp_visit_cond_stmt (as_a <gcond *> (stmt), > + &taken_edge); > + if (taken_edge) > + { > + if (taken_edge->flags & EDGE_TRUE_VALUE) > + gimple_cond_make_true (as_a <gcond *> (stmt)); > + else if (taken_edge->flags & EDGE_FALSE_VALUE) > + gimple_cond_make_false (as_a <gcond *> (stmt)); > + else > + gcc_unreachable (); > + gimple_set_modified (stmt, true); > + update_stmt (stmt); > + cfg_altered = true; > + return taken_edge; > + } > + } > } > > update_stmt_if_modified (stmt); >