https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110218
            Bug ID: 110218
           Summary: sink pass heuristic not working in practice
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

In r0-112722-g1cc17820c32e8b Jeff introduced --param sink-frequency-threshold,
replacing

-  /* Move the expression to a post dominator can't reduce the number of
-     executions.  */
-  if (dominated_by_p (CDI_POST_DOMINATORS, frombb, sinkbb))
-    return false;

with

  /* If BEST_BB is at the same nesting level, then require it to have
     significantly lower execution frequency to avoid gratuitous movement.  */
  if (bb_loop_depth (best_bb) == bb_loop_depth (early_bb)
      /* If result of comparsion is unknown, prefer EARLY_BB.
         Thus use !(...>=..) rather than (...<...)  */
      && !(best_bb->count * 100 >= early_bb->count * threshold))
    return best_bb;

where the logic of course is that when best_bb post-dominates early_bb
then the counts should be the same.  Other than that, the new logic
prevents some partial dead code elimination.  In addition to moving code
to only conditionally executed places, the sinking pass performs
scheduling by moving the code to the latest "best" basic block, not the
earliest candidate.

The above logic might also catch EH and error control flow, avoiding
sinking along effective fallthru edges.  But rather than applying this
only at the end, we should at least consider doing it in a loop walking
the dominators, similar to the one selecting the best candidate:

  while (temp_bb != early_bb)
    {
      /* If we've moved into a lower loop nest, then that becomes
         our best block.  */
      if (bb_loop_depth (temp_bb) < bb_loop_depth (best_bb))
        best_bb = temp_bb;

      /* Walk up the dominator tree, hopefully we'll find a shallower
         loop nest.  */
      temp_bb = get_immediate_dominator (CDI_DOMINATORS, temp_bb);
    }

I'm seeing guessed profile where the dominated block has higher count
than the dominating block ...

Where to sink a stmt within what is effectively (ignoring error paths) a
post-dominance region is a scheduling and register pressure problem in
which the call ABI (in the case of EH) also matters.  Arguably that's
none of GIMPLE's business.
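
To make the suggestion about applying the frequency test during the dominator walk concrete, here is a minimal sketch on a toy CFG model. It is one possible reading of the proposal, not the actual tree-ssa-sink.cc code: struct toy_bb, significantly_colder, select_end_only and select_in_walk are hypothetical names, counts are plain integers rather than GCC's profile counts, and the "shallower loop nest always wins" early return in select_end_only is an assumption that is not quoted in the report.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a basic block; every name here is hypothetical.  */
struct toy_bb
{
  struct toy_bb *idom;   /* immediate dominator, NULL for the entry block */
  int loop_depth;        /* loop nesting depth */
  unsigned long count;   /* estimated execution count */
};

/* The frequency test from the report: nonzero if CAND's count is below
   THRESHOLD percent of EARLY's count.  */
static int
significantly_colder (const struct toy_bb *cand, const struct toy_bb *early,
                      unsigned long threshold)
{
  return !(cand->count * 100 >= early->count * threshold);
}

/* Shape described in the report: the walk only looks at loop depth and
   the frequency test is applied once, to the final candidate.  */
static struct toy_bb *
select_end_only (struct toy_bb *early, struct toy_bb *late,
                 unsigned long threshold)
{
  struct toy_bb *best = late;
  for (struct toy_bb *t = late; t != early; t = t->idom)
    {
      assert (t != NULL);   /* EARLY must dominate LATE */
      if (t->loop_depth < best->loop_depth)
        best = t;
    }
  /* Assumption: a shallower loop nest is always accepted.  */
  if (best->loop_depth < early->loop_depth)
    return best;
  if (best->loop_depth == early->loop_depth
      && significantly_colder (best, early, threshold))
    return best;
  return early;
}

/* Sketch of the proposal: test every block on the dominator path and
   keep the shallowest acceptable one, preferring the latest on ties.  */
static struct toy_bb *
select_in_walk (struct toy_bb *early, struct toy_bb *late,
                unsigned long threshold)
{
  struct toy_bb *best = NULL;
  for (struct toy_bb *t = late; t != early; t = t->idom)
    {
      assert (t != NULL);   /* EARLY must dominate LATE */
      int ok = t->loop_depth < early->loop_depth
               || (t->loop_depth == early->loop_depth
                   && significantly_colder (t, early, threshold));
      if (ok && (best == NULL || t->loop_depth < best->loop_depth))
        best = t;
    }
  return best != NULL ? best : early;
}
```

With a dominator chain early (count 100) -> mid (count 40) -> late (count 100), all at loop depth 0 and with a threshold of 75 chosen for illustration, select_end_only tests only late and falls back to early, while select_in_walk accepts mid. This is the kind of guessed profile mentioned above, where the dominated block has a higher count than its dominator.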