https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115825

--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
We also apply a 2/3 factor for an estimate on code reduction due to followup
optimization (we've worked on this not long ago - it seems to be critical for
some testcases derived from SPEC for example).  We're likely hitting

static bool
try_unroll_loop_completely (class loop *loop,
                            edge exit, tree niter, bool may_be_zero,
                            enum unroll_level ul,
                            HOST_WIDE_INT maxiter,
                            dump_user_location_t locus, bool allow_peel,
                            bool cunrolli)
{
...
          unsigned HOST_WIDE_INT ninsns = size.overall;
          unsigned HOST_WIDE_INT unr_insns
            = estimated_unrolled_size (&size, n_unroll);
          if (dump_file && (dump_flags & TDF_DETAILS))
            {
              fprintf (dump_file, "  Loop size: %d\n", (int) ninsns);
              fprintf (dump_file, "  Estimated size after unrolling: %d\n",
                       (int) unr_insns);
            }

          /* If the code is going to shrink, we don't need to be extra
             cautious on guessing if the unrolling is going to be
             profitable.
             Move from estimated_unrolled_size to unroll small loops.  */
          if (unr_insns * 2 / 3
              /* If there is IV variable that will become constant, we
                 save one instruction in the loop prologue we do not
                 account otherwise.  */
              <= ninsns + (size.constant_iv != false))
            ;

but note the estimate is already suspiciously low.

Btw, on trunk I see

Loop 1 iterates 8 times.
Loop 1 iterates at most 8 times.
Loop 1 likely iterates at most 8 times.
Estimating sizes for loop 1
 BB: 4, after_exit: 0
  size:   2 if (i_3 != 0)
   Exit condition will be eliminated in peeled copies.
   Exit condition will be eliminated in last copy.
   Constant conditional.
 BB: 3, after_exit: 1
  size:   1 v ={v} i_3;
  size:   0 i.0_1 = (unsigned char) i_3;
   Induction variable computation will be folded away.
  size:   1 _2 = i.0_1 + 254;
   Induction variable computation will be folded away.
  size:   0 i_7 = (char) _2;
   Induction variable computation will be folded away.
size: 4-3, last_iteration: 2-2
  Loop size: 4
  Estimated size after unrolling: 8
Not unrolling loop 1: size would grow.
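
For reference, a minimal sketch of the arithmetic behind the quoted comparison, using the numbers from the dump above (loop size 4, estimated unrolled size 8). The helper name passes_shrink_check is hypothetical and is not a GCC function. It mirrors only the single quoted condition, not the other checks in try_unroll_loop_completely. The dump does not show whether size.constant_iv is set, so both cases are covered.

```c
#include <stdbool.h>

/* Hypothetical helper, not part of GCC: mirrors only the quoted test
     unr_insns * 2 / 3 <= ninsns + (size.constant_iv != false)
   ninsns      - loop size before unrolling ("Loop size" in the dump)
   unr_insns   - "Estimated size after unrolling" in the dump
   constant_iv - stands in for size.constant_iv (assumed, not in the dump)  */
bool
passes_shrink_check (unsigned long long ninsns,
                     unsigned long long unr_insns,
                     bool constant_iv)
{
  /* Integer division truncates, so 8 * 2 / 3 evaluates to 16 / 3 == 5.  */
  unsigned long long scaled = unr_insns * 2 / 3;

  /* One extra instruction is credited when an IV becomes constant.  */
  return scaled <= ninsns + (constant_iv ? 1 : 0);
}
```

With the dump's values, passes_shrink_check (4, 8, false) compares 5 <= 4 and fails, while passes_shrink_check (4, 8, true) compares 5 <= 5 and passes. If this comparison is what decides here, the "size would grow" message is consistent with the constant_iv credit not being applied.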

I don't have an avr cross on the 14 branch to cross-check - can you paste
more context from the cunroll[i] details dump?
