https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110218

            Bug ID: 110218
           Summary: sink pass heuristic not working in practice
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

In r0-112722-g1cc17820c32e8b Jeff introduced --param sink-frequency-threshold
replacing

-  /* Move the expression to a post dominator can't reduce the number of
-     executions.  */
-  if (dominated_by_p (CDI_POST_DOMINATORS, frombb, sinkbb))
-    return false;

with

  /* If BEST_BB is at the same nesting level, then require it to have
     significantly lower execution frequency to avoid gratuitous movement.  */
  if (bb_loop_depth (best_bb) == bb_loop_depth (early_bb)
      /* If result of comparsion is unknown, prefer EARLY_BB.
         Thus use !(...>=..) rather than (...<...)  */
      && !(best_bb->count * 100 >= early_bb->count * threshold))
    return best_bb;

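where the logic of course is that when best_bb post-dominates early_bb
the counts should be the same, so the threshold test still rejects the
move.  Beyond that case the new logic also prevents some partial dead
code elimination, since a block that executes only slightly less often
than early_bb is rejected as well.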
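
To make the effect of the quoted check concrete, here is a minimal
sketch that models it with plain integers.  The struct toy_bb, the
helper toy_accept_same_depth and the threshold value 75 are assumptions
made up for illustration; GCC uses profile_count, whose comparisons can
also be "unknown", which this toy does not model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a basic block; not GCC's basic_block.  */
struct toy_bb
{
  int loop_depth;
  uint64_t count;
};

/* Return true when the quoted same-depth test would pick BEST over
   EARLY: both blocks sit at the same loop depth and BEST's count is
   below THRESHOLD percent of EARLY's count.  */
static bool
toy_accept_same_depth (const struct toy_bb *best,
                       const struct toy_bb *early,
                       unsigned threshold)
{
  assert (threshold <= 100);  /* The value is a percentage.  */
  return best->loop_depth == early->loop_depth
         && !(best->count * 100 >= early->count * threshold);
}
```

With early at count 1000 and a threshold of 75, a post-dominating
block with count 1000 is rejected (like with the old check), a block
with count 100 is accepted, and a block with count 800 is rejected
even though sinking there would still skip some executions.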

Besides moving code to places that are only conditionally executed,
the sinking pass also performs scheduling by moving the code to the
latest "best" basic block, not the earliest candidate.

The above logic might also catch EH and error control flow, avoiding
sinking along effective fallthru edges.  But rather than applying this
only at the end we should at least consider doing it in a loop
walking the dominators, similar to the one selecting the best candidate:
  while (temp_bb != early_bb)
    {
      /* If we've moved into a lower loop nest, then that becomes
         our best block.  */
      if (bb_loop_depth (temp_bb) < bb_loop_depth (best_bb))
        best_bb = temp_bb;

      /* Walk up the dominator tree, hopefully we'll find a shallower
         loop nest.  */
      temp_bb = get_immediate_dominator (CDI_DOMINATORS, temp_bb);
    }
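
One possible reading of "doing it in a loop", as a self-contained
sketch: apply the frequency test at every step of the dominator walk
and prefer the block closer to early unless the current best is
significantly colder.  The struct toy_block and the helpers
toy_significantly_colder and toy_select_best are hypothetical toy
names, counts are plain integers, and this is not what the pass
currently does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy block: immediate dominator, loop depth, count.  */
struct toy_block
{
  const struct toy_block *idom;
  int loop_depth;
  uint64_t count;
};

/* Nonzero when A's count is below THRESHOLD percent of B's count,
   using the same !(... >= ...) shape as the existing check.  */
static int
toy_significantly_colder (const struct toy_block *a,
                          const struct toy_block *b,
                          unsigned threshold)
{
  return !(a->count * 100 >= b->count * threshold);
}

/* Walk from LATE up the dominator chain to EARLY and return the
   chosen block.  Returning EARLY means "do not sink".  */
static const struct toy_block *
toy_select_best (const struct toy_block *early,
                 const struct toy_block *late,
                 unsigned threshold)
{
  const struct toy_block *best = late;
  const struct toy_block *temp = late;

  while (temp != early)
    {
      assert (temp->idom != NULL);  /* EARLY must dominate LATE.  */
      temp = temp->idom;

      /* A shallower loop nest always wins.  */
      if (temp->loop_depth < best->loop_depth)
        best = temp;
      /* At the same depth keep BEST only if it is significantly
         colder; otherwise sink the least distance.  */
      else if (temp->loop_depth == best->loop_depth
               && !toy_significantly_colder (best, temp, threshold))
        best = temp;
      /* A deeper loop nest is never better, so skip it.  */
    }
  return best;
}
```

For a chain early (count 1000) -> mid (count 100) -> late (count 100)
at one loop depth and a threshold of 75, this picks mid: late is not
colder than mid, and mid is significantly colder than early.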

I'm seeing guessed profiles where the dominated block has a higher
count than the dominating block ...

Where to sink a stmt within an effective (ignoring error paths)
post-dominance region is a scheduling and register pressure problem
where the call ABI (in case of EH) also matters.  Arguably that's none
of GIMPLE's business.
