Hi, this patch makes simple_iv to determine more often that IV can not overflow. First I commonized the logic in simple_iv with nowrap_type_p because it tests the same. Second I added iv_can_overflow_p which uses known upper bound on number of iteration to see if the IV calculation can overflow.
One interesting thig is that the ivcanon IV variables that goes from niter to 0 are believed to be wrapping. This is because the type is unsigned and -1 is then large number. It is not specified what overflow means, I suppose one can think of it as overflow in the calucaltion what sort of happens in this case. Inspecting the code I think both users (niter and ivopts) agrees with different interpretation that the induction can not wrap: that is the sequence produced is always monotonously increasing or decreasing. I would suggest to incrementally rename it to no_wrap_p and add comment in this sense and handle this case, too. It will need update at one place in ivopts where we are detemrining the direction of iteration: /* We need to know that the candidate induction variable does not overflow. While more complex analysis may be used to prove this, for now just check that the variable appears in the original program and that it is computed in a type that guarantees no overflows. */ cand_type = TREE_TYPE (cand->iv->base); if (cand->pos != IP_ORIGINAL || !nowrap_type_p (cand_type)) return false; .. step = int_cst_value (cand->iv->step); ... /* Determine the new comparison operator. */ comp = step < 0 ? GT_EXPR : LT_EXPR; (which is the only occurence of step). I guess we can add function iv_direction that will return -1,0,1 for monotonously decreasing, constant/unknown and monotonously increasing inductions. Honza * tree-scalar-evolution.h (iv_can_overflow_p): Declare. * tree-scalar-evolution.c (iv_can_overflow_p): New funcition. (simple_iv): Use it. * tree-ssa-loop-niter.c (nowrap_type_p): Use ANY_INTEGRAL_TYPE_P * gcc.dg/tree-ssa/scev-14.c: New testcase. Index: tree-scalar-evolution.h =================================================================== --- tree-scalar-evolution.h (revision 237798) +++ tree-scalar-evolution.h (working copy) @@ -38,6 +38,7 @@ extern unsigned int scev_const_prop (voi extern bool expression_expensive_p (tree); extern bool simple_iv (struct loop *, struct loop *, tree, struct affine_iv *, bool); +extern bool iv_can_overflow_p (struct loop *, tree, tree, tree); extern tree compute_overall_effect_of_inner_loop (struct loop *, tree); /* Returns the basic block preceding LOOP, or the CFG entry block when Index: tree-scalar-evolution.c =================================================================== --- tree-scalar-evolution.c (revision 237798) +++ tree-scalar-evolution.c (working copy) @@ -3309,6 +3310,70 @@ scev_reset (void) } } +/* Return true if the IV calculation in TYPE can overflow based on the knowledge + of the upper bound on the number of iterations of LOOP, the BASE and STEP + of IV. + + We do not use information whether TYPE can overflow so it is safe to + use this test even for derived IVs not computed every iteration or + hypotetical IVs to be inserted into code. */ + +bool +iv_can_overflow_p (struct loop *loop, tree type, tree base, tree step) +{ + widest_int nit, base_min, base_max, step_min, step_max, type_min, type_max; + wide_int min, max; + tree maxt, mint; + + if (TREE_CODE (base) == INTEGER_CST) + base_min = base_max = wi::to_widest (base); + else if (TREE_CODE (base) == SSA_NAME && !POINTER_TYPE_P (TREE_TYPE (base)) + && get_range_info (base, &min, &max) == VR_RANGE) + { + base_min = widest_int::from (min, TYPE_SIGN (TREE_TYPE (base))); + base_max = widest_int::from (max, TYPE_SIGN (TREE_TYPE (base))); + } + else + return true; + + if (TREE_CODE (step) == INTEGER_CST) + step_min = step_max = wi::to_widest (step); + else if (TREE_CODE (step) == SSA_NAME && !POINTER_TYPE_P (TREE_TYPE (base)) + && get_range_info (step, &min, &max) == VR_RANGE) + { + step_min = widest_int::from (min, TYPE_SIGN (TREE_TYPE (step))); + step_max = widest_int::from (max, TYPE_SIGN (TREE_TYPE (step))); + } + else + return true; + + if (!get_max_loop_iterations (loop, &nit)) + /* 10GHz CPU runing one cycle loop will reach 2^60 iterations in 260 + years. I won't live long enough to be forced to fix the + miscompilation. Having the limit here will let us to consider + 64bit IVs with base 0 and step 1...16 as non-wrapping which makes + niter and ivopts go smoother. */ + nit = ((widest_int)1 << 60); + + if (INTEGRAL_TYPE_P (type)) + { + maxt = TYPE_MAX_VALUE (type); + mint = TYPE_MIN_VALUE (type); + } + else + { + maxt = upper_bound_in_type (type, type); + mint = lower_bound_in_type (type, type); + } + + type_min = wi::to_widest (mint); + type_max = wi::to_widest (maxt); + if ((base_max + step_max * (nit + 1)) > (type_max) + || type_min > (base_min + step_min * (nit + 1))) + return true; + return false; +} + /* Checks whether use of OP in USE_LOOP behaves as a simple affine iv with respect to WRTO_LOOP and returns its base and step in IV if possible (see analyze_scalar_evolution_in_loop for more details on USE_LOOP @@ -3375,8 +3440,12 @@ simple_iv (struct loop *wrto_loop, struc if (tree_contains_chrecs (iv->base, NULL)) return false; - iv->no_overflow = (!folded_casts && ANY_INTEGRAL_TYPE_P (type) - && TYPE_OVERFLOW_UNDEFINED (type)); + iv->no_overflow = !folded_casts && nowrap_type_p (type); + + if (!iv->no_overflow + && !iv_can_overflow_p (wrto_loop, type, iv->base, iv->step)) + iv->no_overflow = true; + /* Try to simplify iv base: Index: tree-ssa-loop-niter.c =================================================================== --- tree-ssa-loop-niter.c (revision 237798) +++ tree-ssa-loop-niter.c (working copy) @@ -4105,7 +4105,7 @@ n_of_executions_at_most (gimple *stmt, bool nowrap_type_p (tree type) { - if (INTEGRAL_TYPE_P (type) + if (ANY_INTEGRAL_TYPE_P (type) && TYPE_OVERFLOW_UNDEFINED (type)) return true; Index: testsuite/gcc.dg/tree-ssa/scev-14.c =================================================================== --- testsuite/gcc.dg/tree-ssa/scev-14.c (revision 0) +++ testsuite/gcc.dg/tree-ssa/scev-14.c (working copy) @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-ivopts-details" } */ +int a[100]; +void t(unsigned int n) +{ + unsigned int i; + for (i=0; i<n; i++) + a[i]++; +} +/* { dg-final { scan-tree-dump "Overflowness wrto loop niter: No-overflow" "ivopts" } } */ +/* { dg-final { scan-tree-dump-not "Overflowness wrto loop niter: Overflow" "ivopts" } } */