https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108352

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |law at gcc dot gnu.org

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
Checking profitability of path (backwards):  bb:3 (6 insns) bb:9 (0 insns) (latch) bb:5
  Control statement insns: 2
  Overall: 4 insns
  [4] Registering jump thread: (5, 9) incoming edge;  (9, 3) normal (back) (3, 4) nocopy;
path: 5->9->3->4 SUCCESS

but

Checking profitability of path (backwards):  bb:3 (6 insns) bb:9 (latch)
  Control statement insns: 2
  Overall: 4 insns
  FAIL: Would create irreducible loop without threading multiway branch.
path: 9->3->xx REJECTED

After the patch we no longer consider the first path, which just adds an
unrelated jump to the path.  The rejection comes from the

  /* We avoid creating irreducible inner loops unless we thread through
     a multiway branch, in which case we have deemed it worth losing
     other loop optimizations later.

     We also consider it worth creating an irreducible inner loop if
     the number of copied statement is low relative to the length of
     the path -- in that case there's little the traditional loop
     optimizer would have done anyway, so an irreducible loop is not
     so bad.  */
  if (!threaded_multiway_branch
      && creates_irreducible_loop
      && *creates_irreducible_loop
      && (n_insns * (unsigned) param_fsm_scale_path_stmts
          > (m_path.length () *
             (unsigned) param_fsm_scale_path_blocks)))

    {
      if (dump_file && (dump_flags & TDF_DETAILS))
        fprintf (dump_file,
                 "  FAIL: Would create irreducible loop without threading "
                 "multiway branch.\n");
      return false;

heuristic, which for 9 -> 3 evaluates to 4 * 2 > 2 * 3 (true, so the path
is rejected), but for 5 -> 9 -> 3 we get 4 * 2 > 3 * 3 (false, so the
path is accepted).
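
To make the arithmetic above easy to replay, here is a minimal sketch of
just the size comparison from the quoted condition.  The function name and
its flattened parameters are hypothetical and not part of GCC; the numbers
plugged in below are the ones from the dumps (4 insns, statement scale 2,
block scale 3).

```c
#include <stdbool.h>

/* Hypothetical stand-in for the size test in the quoted condition.
   Returns true when the path would be rejected, assuming the thread
   creates an irreducible loop and does not go through a multiway
   branch.  */
bool
rejected_by_size_test (unsigned n_insns, unsigned path_blocks,
                       unsigned scale_stmts, unsigned scale_blocks)
{
  return n_insns * scale_stmts > path_blocks * scale_blocks;
}
```

With scale_stmts = 2 and scale_blocks = 3, rejected_by_size_test (4, 2, 2, 3)
is true for 9 -> 3 (8 > 6), while rejected_by_size_test (4, 3, 2, 3) is
false for 5 -> 9 -> 3 (8 > 9 does not hold).  With scale_blocks = 4 the
shorter path passes as well, since 8 > 8 does not hold.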

It's also worth noting that neither of the two threads creates an irreducible
loop in the end for this particular case, since e is also constant on entry
and thus the jump is resolved and the extra loop entry is removed (but
that's out of scope of the threader's analysis here).

IMHO it makes no sense to reject the shorter path while accepting the
longer one, so the above "heuristic" looks bogus to me.  Raising
--param fsm-scale-path-blocks to 4 "fixes" the testcase on trunk
(4 * 2 > 2 * 4 no longer holds).

The heuristic was added in r6-6600-g2b572b3c213b51 by Jeff in an attempt
to address a coremark regression (PR68398).  I guess Jeff remembers nothing
about this.

Note this is not about adding inner irreducible loops but about making the
loop itself irreducible.  The length of the path itself also says nothing
about the length of a path through the irreducible loop ...

Reverting the heuristic would reject every thread that creates an
irreducible loop unless it goes through a multiway branch.  We have another
heuristic that rejects threading through the latch early:

  /* Threading through an empty latch would cause code to be added to
     the latch.  This could alter the loop form sufficiently to cause
     loop optimizations to fail.  Disable these threads until after
     loop optimizations have run.  */
  if ((threaded_through_latch
       || (taken_edge && taken_edge->dest == loop->latch))
      && !(cfun->curr_properties & PROP_loop_opts_done)
      && empty_block_p (loop->latch))

so we could reject irreducible loops before loop opts (rather than covering
only the empty-latch case) and otherwise generally allow them, even for
non-multi-way branches.

That said, I fear I'm going to replace one bogus heuristic with another ;)

I'm still going to test replacing the heuristic with the following
(which allows removing the fsm-scale-path-blocks param).

  /* We avoid creating irreducible inner loops unless we thread through
     a multiway branch, in which case we have deemed it worth losing
     other loop optimizations later.

     We also consider it worth creating an irreducible inner loop after
     loop optimizations if the number of copied statements is low.  */
  if (!m_threaded_multiway_branch
      && *creates_irreducible_loop
      && (!(cfun->curr_properties & PROP_loop_opts_done)
          || (m_n_insns * param_fsm_scale_path_stmts
              >= param_max_jump_thread_duplication_stmts)))
    {
      if (dump_file && (dump_flags & TDF_DETAILS))
        fprintf (dump_file,
                 "  FAIL: Would create irreducible loop early without "
                 "threading multiway branch.\n");
      /* We compute creates_irreducible_loop only late.  */
      return false;
    }
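
For illustration only, the proposed condition can be modelled outside of
GCC as a plain predicate.  The struct, the function name and the flattened
parameters are hypothetical; loop_opts_done stands in for the
PROP_loop_opts_done test, and any limit passed as max_dup_stmts (for
example 15) is an assumed value for max-jump-thread-duplication-stmts, not
something taken from the dumps above.

```c
#include <stdbool.h>

/* Hypothetical flattened view of what the proposed check looks at.  */
struct thread_candidate
{
  bool threaded_multiway_branch;
  bool creates_irreducible_loop;
  bool loop_opts_done;   /* stands in for PROP_loop_opts_done */
  unsigned n_insns;      /* statements that would be copied */
};

/* Returns true when the proposed check would reject the candidate.  */
bool
rejected_by_proposed_check (const struct thread_candidate *c,
                            unsigned scale_stmts, unsigned max_dup_stmts)
{
  return !c->threaded_multiway_branch
         && c->creates_irreducible_loop
         && (!c->loop_opts_done
             || c->n_insns * scale_stmts >= max_dup_stmts);
}
```

Under these assumptions a 4-insn candidate that creates an irreducible loop
is rejected before loop optimizations regardless of its size, and accepted
afterwards as long as 4 * 2 stays below the assumed limit.  The number of
blocks on the path no longer enters the decision.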
