On Mon, 27 Jun 2016, Jan Hubicka wrote: > Hi, > this patch makes simple_iv to determine more often that IV can not overflow. > First I commonized the logic in simple_iv with nowrap_type_p because it tests > the same. Second I added iv_can_overflow_p which uses known upper bound on > number of iteration to see if the IV calculation can overflow.
+ if (!get_max_loop_iterations (loop, &nit)) + /* 10GHz CPU runing one cycle loop will reach 2^60 iterations in 260 + years. I won't live long enough to be forced to fix the + miscompilation. Having the limit here will let us to consider + 64bit IVs with base 0 and step 1...16 as non-wrapping which makes + niter and ivopts go smoother. */ + nit = ((widest_int)1 << 60); I think that's on the border of being acceptable. Consider said loop being autoparallelized and run on a PFlop cluster. Please take it out for now. + if (INTEGRAL_TYPE_P (type)) + { + maxt = TYPE_MAX_VALUE (type); + mint = TYPE_MIN_VALUE (type); + } + else + { + maxt = upper_bound_in_type (type, type); + mint = lower_bound_in_type (type, type); + } I think we don't support any non-integral/pointer IVs and thus you can simply use type_min/max = wi::max/min_value (TYPE_PRECISION (type), TYPE_SIGN (type)); + type_min = wi::to_widest (mint); + type_max = wi::to_widest (maxt); + if ((base_max + step_max * (nit + 1)) > (type_max) + || type_min > (base_min + step_min * (nit + 1))) + return true; so I'm not convinced that widest_int precision is enough here. Can't you use wide_ints instead of widest_ints and use the wi::add / wi::mult overloads which provide the overflow flag? Otherwise do what VRP does and use FIXED_WIDE_INT to properly handle __int128_t IVs. > One interesting thig is that the ivcanon IV variables that goes from niter to > 0 > are believed to be wrapping. This is because the type is unsigned and -1 is > then large number. > > It is not specified what overflow means, I suppose one can think of it as > overflow > in the calucaltion what sort of happens in this case. > Inspecting the code I think both users (niter and ivopts) agrees with > different > interpretation that the induction can not wrap: that is the sequence produced > is always monotonously increasing or decreasing. That makes sense. > I would suggest to incrementally rename it to no_wrap_p and add comment in > this > sense and handle this case, too. It will need update at one place in ivopts > where > we are detemrining the direction of iteration: > > /* We need to know that the candidate induction variable does not overflow. > While more complex analysis may be used to prove this, for now just > check that the variable appears in the original program and that it > is computed in a type that guarantees no overflows. */ > cand_type = TREE_TYPE (cand->iv->base); > if (cand->pos != IP_ORIGINAL || !nowrap_type_p (cand_type)) > return false; > .. > step = int_cst_value (cand->iv->step); > > ... > /* Determine the new comparison operator. */ > > comp = step < 0 ? GT_EXPR : LT_EXPR; > > > (which is the only occurence of step). I guess we can add function > iv_direction > that will return -1,0,1 for monotonously decreasing, constant/unknown and > monotonously increasing inductions. scev provides scev_direction for this (of course we don't have the chrec anymore here). Richard. > > Honza > > * tree-scalar-evolution.h (iv_can_overflow_p): Declare. > * tree-scalar-evolution.c (iv_can_overflow_p): New funcition. > (simple_iv): Use it. > * tree-ssa-loop-niter.c (nowrap_type_p): Use ANY_INTEGRAL_TYPE_P > > * gcc.dg/tree-ssa/scev-14.c: New testcase. > Index: tree-scalar-evolution.h > =================================================================== > --- tree-scalar-evolution.h (revision 237798) > +++ tree-scalar-evolution.h (working copy) > @@ -38,6 +38,7 @@ extern unsigned int scev_const_prop (voi > extern bool expression_expensive_p (tree); > extern bool simple_iv (struct loop *, struct loop *, tree, struct affine_iv > *, > bool); > +extern bool iv_can_overflow_p (struct loop *, tree, tree, tree); > extern tree compute_overall_effect_of_inner_loop (struct loop *, tree); > > /* Returns the basic block preceding LOOP, or the CFG entry block when > Index: tree-scalar-evolution.c > =================================================================== > --- tree-scalar-evolution.c (revision 237798) > +++ tree-scalar-evolution.c (working copy) > @@ -3309,6 +3310,70 @@ scev_reset (void) > } > } > > +/* Return true if the IV calculation in TYPE can overflow based on the > knowledge > + of the upper bound on the number of iterations of LOOP, the BASE and STEP > + of IV. > + > + We do not use information whether TYPE can overflow so it is safe to > + use this test even for derived IVs not computed every iteration or > + hypotetical IVs to be inserted into code. */ > + > +bool > +iv_can_overflow_p (struct loop *loop, tree type, tree base, tree step) > +{ > + widest_int nit, base_min, base_max, step_min, step_max, type_min, type_max; > + wide_int min, max; > + tree maxt, mint; > + > + if (TREE_CODE (base) == INTEGER_CST) > + base_min = base_max = wi::to_widest (base); > + else if (TREE_CODE (base) == SSA_NAME && !POINTER_TYPE_P (TREE_TYPE (base)) > + && get_range_info (base, &min, &max) == VR_RANGE) > + { > + base_min = widest_int::from (min, TYPE_SIGN (TREE_TYPE (base))); > + base_max = widest_int::from (max, TYPE_SIGN (TREE_TYPE (base))); > + } > + else > + return true; > + > + if (TREE_CODE (step) == INTEGER_CST) > + step_min = step_max = wi::to_widest (step); > + else if (TREE_CODE (step) == SSA_NAME && !POINTER_TYPE_P (TREE_TYPE (base)) > + && get_range_info (step, &min, &max) == VR_RANGE) > + { > + step_min = widest_int::from (min, TYPE_SIGN (TREE_TYPE (step))); > + step_max = widest_int::from (max, TYPE_SIGN (TREE_TYPE (step))); > + } > + else > + return true; > + > + if (!get_max_loop_iterations (loop, &nit)) > + /* 10GHz CPU runing one cycle loop will reach 2^60 iterations in 260 > + years. I won't live long enough to be forced to fix the > + miscompilation. Having the limit here will let us to consider > + 64bit IVs with base 0 and step 1...16 as non-wrapping which makes > + niter and ivopts go smoother. */ > + nit = ((widest_int)1 << 60); > + > + if (INTEGRAL_TYPE_P (type)) > + { > + maxt = TYPE_MAX_VALUE (type); > + mint = TYPE_MIN_VALUE (type); > + } > + else > + { > + maxt = upper_bound_in_type (type, type); > + mint = lower_bound_in_type (type, type); > + } > + > + type_min = wi::to_widest (mint); > + type_max = wi::to_widest (maxt); > + if ((base_max + step_max * (nit + 1)) > (type_max) > + || type_min > (base_min + step_min * (nit + 1))) > + return true; > + return false; > +} > + > /* Checks whether use of OP in USE_LOOP behaves as a simple affine iv with > respect to WRTO_LOOP and returns its base and step in IV if possible > (see analyze_scalar_evolution_in_loop for more details on USE_LOOP > @@ -3375,8 +3440,12 @@ simple_iv (struct loop *wrto_loop, struc > if (tree_contains_chrecs (iv->base, NULL)) > return false; > > - iv->no_overflow = (!folded_casts && ANY_INTEGRAL_TYPE_P (type) > - && TYPE_OVERFLOW_UNDEFINED (type)); > + iv->no_overflow = !folded_casts && nowrap_type_p (type); > + > + if (!iv->no_overflow > + && !iv_can_overflow_p (wrto_loop, type, iv->base, iv->step)) > + iv->no_overflow = true; > + > > /* Try to simplify iv base: > > Index: tree-ssa-loop-niter.c > =================================================================== > --- tree-ssa-loop-niter.c (revision 237798) > +++ tree-ssa-loop-niter.c (working copy) > @@ -4105,7 +4105,7 @@ n_of_executions_at_most (gimple *stmt, > bool > nowrap_type_p (tree type) > { > - if (INTEGRAL_TYPE_P (type) > + if (ANY_INTEGRAL_TYPE_P (type) > && TYPE_OVERFLOW_UNDEFINED (type)) > return true; > > Index: testsuite/gcc.dg/tree-ssa/scev-14.c > =================================================================== > --- testsuite/gcc.dg/tree-ssa/scev-14.c (revision 0) > +++ testsuite/gcc.dg/tree-ssa/scev-14.c (working copy) > @@ -0,0 +1,11 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fdump-tree-ivopts-details" } */ > +int a[100]; > +void t(unsigned int n) > +{ > + unsigned int i; > + for (i=0; i<n; i++) > + a[i]++; > +} > +/* { dg-final { scan-tree-dump "Overflowness wrto loop niter: > No-overflow" "ivopts" } } */ > +/* { dg-final { scan-tree-dump-not "Overflowness wrto loop niter: > Overflow" "ivopts" } } */ > > -- Richard Biener <rguent...@suse.de> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)