http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57343
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |rakdver at gcc dot gnu.org
--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
The value C is equal to final - base, where final and base are the final and
initial value of the actual induction variable in the analysed loop. BNDS
bounds the value of this difference when computed in signed type with
unbounded range, while the computation of C is performed in an unsigned
type with the range matching the range of the type of the induction
variable.
In particular, BNDS.up contains an upper bound on C in the following cases:
-- if the iv must reach its final value without overflow, i.e., if
NO_OVERFLOW && EXIT_MUST_BE_TAKEN is true, or
-- if final >= base, which we know to hold when BNDS.below >= 0. */
static void
number_of_iterations_ne_max (mpz_t bnd, bool no_overflow, tree c, tree s,
bounds *bnds, bool exit_must_be_taken)
{
double_int max;
mpz_t d;
bool bnds_u_valid = ((no_overflow && exit_must_be_taken)
|| mpz_sgn (bnds->below) >= 0);
for the case in question (IV of unsigned char type, base
((unsigned char) (c_lsm.4_12 + -1)) * 100 and step 100 (negated 156))
the bnds values (below 0, up 255 - full range of unsigned char) the
fact that the IV overflows cannot be simply ignored - and I don't see
how we can derive from below >= 0 that final (which is zero!) is
>= base.
We then fall to
if (multiple_of_p (TREE_TYPE (c), c, s))
{
/* If C is an exact multiple of S, then its value will be reached before
the induction variable overflows (unless the loop is exited in some
other way before). Note that the actual induction variable in the
loop (which ranges from base to final instead of from 0 to C) may
overflow, in which case BNDS.up will not be giving a correct upper
bound on C; thus, BNDS_U_VALID had to be computed in advance. */
no_overflow = true;
exit_must_be_taken = true;
which ends up with no_overflow = true and things going downhill from here.
Note that the original, non-negated bnds would _not_ have mpz_sgn (bnds->below)
>= 0 (but -255):
/* Rearrange the terms so that we get inequality S * i <> C, with S
positive. Also cast everything to the unsigned type. If IV does
not overflow, BNDS bounds the value of C. Also, this is the
case if the computation |FINAL - IV->base| does not overflow, i.e.,
if BNDS->below in the result is nonnegative. */
if (tree_int_cst_sign_bit (iv->step))
{
s = fold_convert (niter_type,
fold_build1 (NEGATE_EXPR, type, iv->step));
c = fold_build2 (MINUS_EXPR, niter_type,
fold_convert (niter_type, iv->base),
fold_convert (niter_type, final));
bounds_negate (bnds);
Here |FINAL - IV->base| does overflow, but bounds_negate shadows this.
But even before that we have IVCANON remove the pointless exit
Removed pointless exit: if (i_11 <= 4)
Added canonical iv to loop 1, 3 iterations.
then it adds back the canonical IV, but that's pointless in the next
analysis run done by cunroll and removed there.
The bogus upper bound is determined from local_pure_const via finite_loop_p ()
on the non-header-copied form and the estimate from the h == 0 test.
Analyzing # of iterations of loop 1
exit condition [(unsigned char) (c_6 + -1) * 100, + , 156] != 0
bounds on difference of bases: -255 ... 0
result:
# of iterations (((unsigned char) (c_6 + -1) * 100) /[ex] 4) * 41 & 63,
bounded by 2
In the end I think that number_of_iterations_ne_max does not properly
honor the case where for
if (multiple_of_p (TREE_TYPE (c), c, s))
{
/* If C is an exact multiple of S, then its value will be reached before
the induction variable overflows (unless the loop is exited in some
other way before). Note that the actual induction variable in the
loop (which ranges from base to final instead of from 0 to C) may
overflow, in which case BNDS.up will not be giving a correct upper
bound on C; thus, BNDS_U_VALID had to be computed in advance. */
no_overflow = true;
that multiply overflows. At least the IV does not range from 0 to C
for (unsigned char) (c_6 + -1) * 100.
I have a hard time following the logic that this all is correct for
wrapping IVs ... that said,
Index: gcc/tree-ssa-loop-niter.c
===================================================================
--- gcc/tree-ssa-loop-niter.c (revision 199137)
+++ gcc/tree-ssa-loop-niter.c (working copy)
@@ -627,7 +627,7 @@ number_of_iterations_ne (tree type, affi
not overflow, BNDS bounds the value of C. Also, this is the
case if the computation |FINAL - IV->base| does not overflow, i.e.,
if BNDS->below in the result is nonnegative. */
- if (tree_int_cst_sign_bit (iv->step))
+ if (tree_int_cst_sgn (iv->step) == -1)
{
s = fold_convert (niter_type,
fold_build1 (NEGATE_EXPR, type, iv->step));
fixes the testcase (otherwise untested).