4.9 Regression] wrong code on x86_64-linux at -Os and above

rguenth at gcc dot gnu.org Tue, 21 May 2013 04:54:10 -0700

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57343


Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rakdver at gcc dot gnu.org

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
   The value C is equal to final - base, where final and base are the final and
   initial value of the actual induction variable in the analysed loop.  BNDS
   bounds the value of this difference when computed in signed type with
   unbounded range, while the computation of C is performed in an unsigned
   type with the range matching the range of the type of the induction
variable.
   In particular, BNDS.up contains an upper bound on C in the following cases:
   -- if the iv must reach its final value without overflow, i.e., if
      NO_OVERFLOW && EXIT_MUST_BE_TAKEN is true, or
   -- if final >= base, which we know to hold when BNDS.below >= 0.  */

static void
number_of_iterations_ne_max (mpz_t bnd, bool no_overflow, tree c, tree s,
                             bounds *bnds, bool exit_must_be_taken)
{
  double_int max;
  mpz_t d;
  bool bnds_u_valid = ((no_overflow && exit_must_be_taken)
                       || mpz_sgn (bnds->below) >= 0);

for the case in question (IV of unsigned char type, base
((unsigned char) (c_lsm.4_12 + -1)) * 100 and step 100 (negated 156))
the bnds values (below 0, up 255 - full range of unsigned char) the
fact that the IV overflows cannot be simply ignored - and I don't see
how we can derive from below >= 0 that final (which is zero!) is
>= base.

We then fall to

  if (multiple_of_p (TREE_TYPE (c), c, s))
    {
      /* If C is an exact multiple of S, then its value will be reached before
         the induction variable overflows (unless the loop is exited in some
         other way before).  Note that the actual induction variable in the
         loop (which ranges from base to final instead of from 0 to C) may
         overflow, in which case BNDS.up will not be giving a correct upper
         bound on C; thus, BNDS_U_VALID had to be computed in advance.  */
      no_overflow = true;
      exit_must_be_taken = true;

which ends up with no_overflow = true and things going downhill from here.

Note that the original, non-negated bnds would _not_ have mpz_sgn (bnds->below)
>= 0 (but -255):

  /* Rearrange the terms so that we get inequality S * i <> C, with S
     positive.  Also cast everything to the unsigned type.  If IV does
     not overflow, BNDS bounds the value of C.  Also, this is the
     case if the computation |FINAL - IV->base| does not overflow, i.e.,
     if BNDS->below in the result is nonnegative.  */
  if (tree_int_cst_sign_bit (iv->step))
    {
      s = fold_convert (niter_type,
                        fold_build1 (NEGATE_EXPR, type, iv->step));
      c = fold_build2 (MINUS_EXPR, niter_type,
                       fold_convert (niter_type, iv->base),
                       fold_convert (niter_type, final));
      bounds_negate (bnds);

Here |FINAL - IV->base| does overflow, but bounds_negate shadows this.


But even before that we have IVCANON remove the pointless exit

Removed pointless exit: if (i_11 <= 4)
Added canonical iv to loop 1, 3 iterations.

then it adds back the canonical IV, but that's pointless in the next
analysis run done by cunroll and removed there.

The bogus upper bound is determined from local_pure_const via finite_loop_p ()
on the non-header-copied form and the estimate from the h == 0 test.

Analyzing # of iterations of loop 1
  exit condition [(unsigned char) (c_6 + -1) * 100, + , 156] != 0
  bounds on difference of bases: -255 ... 0
  result:
    # of iterations (((unsigned char) (c_6 + -1) * 100) /[ex] 4) * 41 & 63,
bounded by 2

In the end I think that number_of_iterations_ne_max does not properly
honor the case where for

  if (multiple_of_p (TREE_TYPE (c), c, s))
    {
      /* If C is an exact multiple of S, then its value will be reached before
         the induction variable overflows (unless the loop is exited in some
         other way before).  Note that the actual induction variable in the
         loop (which ranges from base to final instead of from 0 to C) may
         overflow, in which case BNDS.up will not be giving a correct upper
         bound on C; thus, BNDS_U_VALID had to be computed in advance.  */
      no_overflow = true;

that multiply overflows.  At least the IV does not range from 0 to C
for (unsigned char) (c_6 + -1) * 100.

I have a hard time following the logic that this all is correct for
wrapping IVs ... that said,

Index: gcc/tree-ssa-loop-niter.c
===================================================================
--- gcc/tree-ssa-loop-niter.c   (revision 199137)
+++ gcc/tree-ssa-loop-niter.c   (working copy)
@@ -627,7 +627,7 @@ number_of_iterations_ne (tree type, affi
      not overflow, BNDS bounds the value of C.  Also, this is the
      case if the computation |FINAL - IV->base| does not overflow, i.e.,
      if BNDS->below in the result is nonnegative.  */
-  if (tree_int_cst_sign_bit (iv->step))
+  if (tree_int_cst_sgn (iv->step) == -1)
     {
       s = fold_convert (niter_type,
                        fold_build1 (NEGATE_EXPR, type, iv->step));

fixes the testcase (otherwise untested).

[Bug tree-optimization/57343] [4.8/4.9 Regression] wrong code on x86_64-linux at -Os and above

Reply via email to