> > Index: tree-scalar-evolution.c > > =================================================================== > > --- tree-scalar-evolution.c (revision 237856) > > +++ tree-scalar-evolution.c (working copy) > > @@ -280,6 +280,7 @@ along with GCC; see the file COPYING3. > > #include "params.h" > > #include "tree-ssa-propagate.h" > > #include "gimple-fold.h" > > +#include "print-tree.h" > > Don't see you need this.
Yes, i forgot this from debugging. > > > static tree analyze_scalar_evolution_1 (struct loop *, tree, tree); > > static tree analyze_scalar_evolution_for_address_of (struct loop *loop, > > @@ -3309,6 +3310,60 @@ scev_reset (void) > > } > > } > > > > +/* Return true if the IV calculation in TYPE can overflow based on the > > knowledge > > + of the upper bound on the number of iterations of LOOP, the BASE and > > STEP > > + of IV. > > + > > + We do not use information whether TYPE can overflow so it is safe to > > + use this test even for derived IVs not computed every iteration or > > + hypotetical IVs to be inserted into code. */ > > + > > +bool > > +iv_can_overflow_p (struct loop *loop, tree type, tree base, tree step) > > Exporting this is also not necessary? The reason why I export this is that incrementally I plan to use it in ivopts. It is constructing its own candidates and also needs to know if they will overflow or not. I will drop it for now and include it in incremental patch. > > > +{ > > + widest_int nit; > > + wide_int base_min, base_max, step_min, step_max, type_min, type_max; > > + signop sgn = TYPE_SIGN (type); > > + signop base_sgn = TYPE_SIGN (TREE_TYPE (base)); > > + signop step_sgn = TYPE_SIGN (TREE_TYPE (step)); > > + > > + if (step == 0) > > + return false; > > Err - you probably mean > > if (integer_zerop (step)) > > here? your check is a NULL pointer check ... Yes, sorry. inteer_zerop (step). Probably does not matter. > > + type_min = wi::min_value (type); > > + type_max = wi::max_value (type); > > + /* Watch overflow. */ > > + if ((widest_int)1 << TYPE_PRECISION (type) < nit) > > + return true; > > TYPE_PRECISION (type) - 1? The comment can be improved I think. For type==char I think TYPE_PRECISION should be 8 and useful nit values are in range 0...255 (because of the +1 addition bellow). So I think it should be if ((widest_int)1 << TYPE_PRECISION (type) < nit) > > > + if ((widest_int::from (base_max, base_sgn) > > + + widest_int::from (step_max, step_sgn) * (nit + 1)) > > + > widest_int::from (type_max, sgn) > > + || (widest_int::from (type_min, sgn) > > + > (widest_int::from (base_min, base_sgn) > > + + widest_int::from (step_min, step_sgn) * (nit + 1)))) > > + return true; > > and this lacks any comment... so it decodes to > > (base_max + step_max * (nit + 1) > type_max) > || (type_min > base_min + step_min * (nit + 1)) Yes, it is trying to compute the final value of IV when loop reaches maximal number of iteration and check that it is in the range of the target type. > > and it basically assumes infinite precision arithmetic for > the computation. As mentioned previously for __int128 widest_int > does _not_ guarantee this so you need to use FIXED_WIDE_INT > with a precision of WIDE_INT_MAX_PRECISION * 2 (+1?) like VRP does. With the NIT check I know that all 4 values are representable in a target type. 3) widest_int. This representation is an approximation of infinite precision math. However, it is not really infinite precision math as in the GMP library. It is really finite precision math where the precision is 4 times the size of the largest integer that the target port can represent. My understanding is that to do the mutiplcation I need two times of the bits of TYPE_PRECISION (type) (+2 for the sign and addition perhaps) and widest_int has 4 times. If this is not true, perhaps the comment can be made more explicit? Thanks for the pointer to vrp, now at least I see how the widening/explicit precision work. I suppose then the type should be FIXED_WIDE_INT (WIDE_INT_MAX_PRECISION * 2 + 2)? > > Note as both type_min/max and base_min/max are wide_ints the above > might be simplified > > step_max * (nit + 1) > type_max - base_max > > where the subtraction can be carried out in wide_int(?) and you > should also CSE nit + 1 or transform it further to > > step_max * nit > type_max - base_max - step_max > > where if I understand correctly, if type_max - base_max - step_max > underflows we already wrap. Yes, this look like a good idea and yes, if type_max-base_max-step_max is negative, we wrap. I suppose even with some inlining doing manual CSE is not bad plan. Thanks, wide_int.h is still bit of black magic to me ;) Honza > > Thanks, > Richard.