On Wed, May 3, 2017 at 5:32 PM, Jeff Law <l...@redhat.com> wrote: > [ With the patch attached... ] > > > On 05/03/2017 10:31 AM, Jeff Law wrote: >> >> This is the first of 3-5 patches to address pr78496. >> >> The goal of these patches is to catch jump threads earlier in the pipeline >> to avoid undesirable behavior in PRE and more generally be able to exploit >> the secondary opportunities exposed by jump threading. >> >> One of the more serious issues I found while investigating 78496 was VRP >> failing to find what should have been obvious jump threads. The fundamental >> issue is VRP will simplify conditionals which are fed by a typecast prior to >> jump threading. So something like this: >> >> x = (typecast) y; >> if (x == 42) >> >> Can often be transformed into: >> >> if (y == 42) >> >> >> The problem is any ASSERT_EXPRS after the conditional will reference "x" >> rather than "y". That in turn makes it impossible for VRP to use those >> ASSERT_EXPRs to thread later jumps that use x == <whatever> >> >> >> More concretely consider this gimple code: >> >> >> ;; basic block 5, loop depth 0, count 0, freq 10000, maybe hot >> ;; prev block 4, next block 12, flags: (NEW, REACHABLE, VISITED) >> ;; pred: 3 [50.0%] (TRUE_VALUE,EXECUTABLE) >> ;; 4 [100.0%] (FALLTHRU,EXECUTABLE) >> # iftmp.0_2 = PHI <1(3), 0(4)> >> in_loop_7 = (unsigned char) iftmp.0_2; >> if (in_loop_7 != 0) >> goto <bb 6>; [33.00%] >> else >> goto <bb 12>; [67.00%] >> >> ;; succ: 6 [33.0%] (TRUE_VALUE,EXECUTABLE) >> ;; 12 [67.0%] (FALSE_VALUE,EXECUTABLE) >> >> ;; basic block 12, loop depth 0, count 0, freq 6700, maybe hot >> ;; prev block 5, next block 6, flags: (NEW) >> ;; pred: 5 [67.0%] (FALSE_VALUE,EXECUTABLE) >> in_loop_15 = ASSERT_EXPR <in_loop_7, in_loop_7 == 0>; >> goto <bb 7>; [100.00%] >> ;; succ: 7 [100.0%] (FALLTHRU) >> >> ;; basic block 6, loop depth 0, count 0, freq 3300, maybe hot >> ;; prev block 12, next block 7, flags: (NEW, REACHABLE, VISITED) >> ;; pred: 5 [33.0%] (TRUE_VALUE,EXECUTABLE) >> in_loop_14 = ASSERT_EXPR <in_loop_7, in_loop_7 != 0>; >> simple_iv (); >> ;; succ: 7 [100.0%] (FALLTHRU,EXECUTABLE) >> >> And later we have: >> >> ;; basic block 9, loop depth 0, count 0, freq 8476, maybe hot >> ;; prev block 8, next block 10, flags: (NEW, REACHABLE, VISITED) >> ;; pred: 7 [84.8%] (FALSE_VALUE,EXECUTABLE) >> if (in_loop_7 == 0) >> goto <bb 10>; [36.64%] >> else >> goto <bb 11>; [63.36%] >> >> VRP knows it can replace the uses of in_loop_7 in the conditionals in >> blocks 5 and 9 with iftmp.0_2 and happily does so *before* jump threading >> (but well after ASSERT_EXPR insertion). >> >> As a result VRP is unable to utilize the ASSERT_EXPRs in blocks 12 and 6 >> (which reference in_loop_7) to thread the jump at bb9 (which now references >> iftmp.0_2). >> >> >> The cases in pr78496 are slightly more complex, but boil down to the same >> core issue -- simplifying the conditional too early. >> >> Thankfully this is easy to fix. We just split the conditional >> simplification into two steps so that the transformation noted above occurs >> after jump threading (the other simplifications we want to occur before jump >> threading). >> >> This allows VRP1 to pick up 27 missed jump threads in the testcase from >> 78496. It could well be enough to address 78496, but since we don't have a >> solid description of the desired end result I won't consider 78496 fixed >> quite yet as there's significant further improvements we can make. >> >> Bootstrapped and regression tested on x86_64-linux-gnu. Installing on the >> trunk. >> >> Jeff >> > > > diff --git a/gcc/ChangeLog b/gcc/ChangeLog > index 92a4e395ba8..32cc81978df 100644 > --- a/gcc/ChangeLog > +++ b/gcc/ChangeLog > @@ -1,3 +1,14 @@ > +2017-05-03 Jeff Law <l...@redhat.com> > + > + PR tree-optimization/78496 > + * tree-vrp.c (simplify_cond_using_ranges_1): Renamed > + from simplify_cond_using_ranges. Split off code to walk > + backwards through casts into ... > + (simplify_cond_using_ranges_2): New function. > + (simplify_stmt_using_ranges): Call simplify_cond_using_ranges_1. > + (execute_vrp): After identifying jump threads, call > + simplify_cond_using_ranges_2. > + > 2017-05-03 Jeff Downs <heydo...@somuchpressure.net> > Rainer Orth <r...@cebitec.uni-bielefeld.de> > > diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog > index 596468d33e9..55a44e4635a 100644 > --- a/gcc/testsuite/ChangeLog > +++ b/gcc/testsuite/ChangeLog > @@ -1,3 +1,8 @@ > +2017-05-03 Jeff Law <l...@redhat.com> > + > + PR tree-optimization/78496 > + * gcc.dg/tree-ssa/ssa-thread-15.c: New test. > + > 2017-05-03 Uros Bizjak <ubiz...@gmail.com> > > * g++.dg/lto/pr79671_0.C (foo): Fix asm constraints. > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-15.c > b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-15.c > new file mode 100644 > index 00000000000..f73268cacbb > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-15.c > @@ -0,0 +1,51 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fdump-tree-vrp1" } */ > + > +/* We should thread the if (!in_loop) completely leaving > + just two conditionals. */ > +/* { dg-final { scan-tree-dump-times "if \\(" 2 "vrp1" } } */ > + > + > +union tree_node; > +typedef union tree_node *tree; > + > +enum size_type_kind > +{ > + SIZETYPE, > + SSIZETYPE, > + BITSIZETYPE, > + SBITSIZETYPE, > + TYPE_KIND_LAST > +}; > +extern tree size_int_kind (long, enum size_type_kind); > + > + > + > +typedef struct > +{ > + > + tree base, step; > + > +} affine_iv; > + > +struct loop > +{ > + > + int num; > +}; > +extern unsigned char simple_iv (); > + > +unsigned char > +dr_analyze_innermost (struct loop *loop, tree poffset) > +{ > + affine_iv offset_iv; > + unsigned char in_loop = (loop && loop->num); > + > + > + if (in_loop) > + simple_iv (); > + > + if (!in_loop) > + offset_iv.step = size_int_kind (0, SSIZETYPE); > + > +} > diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c > index df5e4267491..6a4035dc924 100644 > --- a/gcc/tree-vrp.c > +++ b/gcc/tree-vrp.c > @@ -9977,7 +9977,7 @@ range_fits_type_p (value_range *vr, unsigned > dest_precision, signop dest_sgn) > the original conditional. */ > > static bool > -simplify_cond_using_ranges (gcond *stmt) > +simplify_cond_using_ranges_1 (gcond *stmt) > { > tree op0 = gimple_cond_lhs (stmt); > tree op1 = gimple_cond_rhs (stmt); > @@ -10080,6 +10080,22 @@ simplify_cond_using_ranges (gcond *stmt) > } > } > } > + return false; > +} > + > +/* STMT is a conditional at the end of a basic block. > + > + If the conditional is of the form SSA_NAME op constant and the SSA_NAME > + was set via a type conversion, try to replace the SSA_NAME with the RHS > + of the type conversion. Doing so makes the conversion dead which helps > + subsequent passes. */ > + > +static void > +simplify_cond_using_ranges_2 (gcond *stmt) > +{ > + tree op0 = gimple_cond_lhs (stmt); > + tree op1 = gimple_cond_rhs (stmt); > + enum tree_code cond_code = gimple_cond_code (stmt); > > /* If we have a comparison of an SSA_NAME (OP0) against a constant, > see if OP0 was set by a type conversion where the source of > @@ -10097,7 +10113,7 @@ simplify_cond_using_ranges (gcond *stmt) > > if (!is_gimple_assign (def_stmt) > || !CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def_stmt))) > - return false; > + return; > > innerop = gimple_assign_rhs1 (def_stmt); > > @@ -10142,12 +10158,16 @@ simplify_cond_using_ranges (gcond *stmt) > tree newconst = fold_convert (TREE_TYPE (innerop), op1); > gimple_cond_set_lhs (stmt, innerop); > gimple_cond_set_rhs (stmt, newconst); > - return true; > + update_stmt (stmt); > + if (dump_file && (dump_flags & TDF_DETAILS)) > + { > + fprintf (dump_file, "Folded into: "); > + print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM); > + fprintf (dump_file, "\n"); > + } > } > } > } > - > - return false; > } > > /* Simplify a switch statement using the value range of the switch > @@ -10746,7 +10766,7 @@ simplify_stmt_using_ranges (gimple_stmt_iterator > *gsi) > } > } > else if (gimple_code (stmt) == GIMPLE_COND) > - return simplify_cond_using_ranges (as_a <gcond *> (stmt)); > + return simplify_cond_using_ranges_1 (as_a <gcond *> (stmt)); > else if (gimple_code (stmt) == GIMPLE_SWITCH) > return simplify_switch_using_ranges (as_a <gswitch *> (stmt)); > else if (is_gimple_call (stmt) > @@ -11759,6 +11779,21 @@ execute_vrp (bool warn_array_bounds_p) > the datastructures built by VRP. */ > identify_jump_threads (); > > + /* A comparison of an SSA_NAME against a constant where the SSA_NAME > + was set by a type conversion can often be rewritten to use the > + RHS of the type conversion. > + > + However, doing so inhibits jump threading through the comparison. > + So that transformation is not performed until after jump threading > + is complete. */ > + basic_block bb; > + FOR_EACH_BB_FN (bb, cfun) > + { > + gimple *last = last_stmt (bb); > + if (last && gimple_code (last) == GIMPLE_COND) > + simplify_cond_using_ranges_2 (as_a <gcond *> (last)); > + } > + Hi, I can see the original function is split into two and now executed at different stages, one question is: In the 2nd case of the original function, seems stmt doesn't need to be the last statement of a basic block. After change, simplify_cond_using_ranges_2 is only fed with basic block's last statement. Is it possible to miss some gcond stmt in the middle?
Thanks, bin > vrp_free_lattice (); > > free_numbers_of_iterations_estimates (cfun); >