After shrink-wrapping has found the "tightest fit" for where to place the prologue, it tries move it earlier (so that frame saves are run earlier) -- but without copying any more basic blocks.
Unfortunately a candidate block we select can be inside a loop, and we will still allow it (because the loop always exits via our previously chosen block). We can do that just fine if we make a duplicate of the block, but we do not want to here. So we need to detect this situation. We can place the prologue at a previous block PRE only if PRE dominates every block reachable from it, because then we will never need to duplicate that block (it will always be executed with prologue). Tested on the two testcases from the PRs. Also regression checked on powerpc64-linux (actually, that is still running). Is this okay for trunk? Segher 2015-12-07 Segher Boessenkool <seg...@kernel.crashing.org> PR rtl-optimization/67778 PR rtl-optimization/68634 * shrink-wrap.c (try_shrink_wrapping): Add a comment about why we want to put the prologue earlier. When determining if an earlier block is suitable, make sure it dominates every block reachable from it. --- gcc/shrink-wrap.c | 41 ++++++++++++++++++++++++++++++++++++----- 1 file changed, 36 insertions(+), 5 deletions(-) diff --git a/gcc/shrink-wrap.c b/gcc/shrink-wrap.c index 3a1df84..89788e0 100644 --- a/gcc/shrink-wrap.c +++ b/gcc/shrink-wrap.c @@ -744,33 +744,64 @@ try_shrink_wrapping (edge *entry_edge, bitmap_head *bb_with, vec.quick_push (e->dest); } - vec.release (); - if (dump_file) fprintf (dump_file, "Avoiding non-duplicatable blocks, PRO is now %d\n", pro->index); /* If we can move PRO back without having to duplicate more blocks, do so. + We do this because putting the prologue earlier is better for scheduling. We can move back to a block PRE if every path from PRE will eventually - need a prologue, that is, PRO is a post-dominator of PRE. */ + need a prologue, that is, PRO is a post-dominator of PRE. PRE needs + to dominate every block reachable from itself. */ if (pro != entry) { calculate_dominance_info (CDI_POST_DOMINATORS); + bitmap bb_tmp = BITMAP_ALLOC (NULL); + bitmap_copy (bb_tmp, bb_with); basic_block last_ok = pro; + vec.truncate (0); + while (pro != entry) { basic_block pre = get_immediate_dominator (CDI_DOMINATORS, pro); if (!dominated_by_p (CDI_POST_DOMINATORS, pre, pro)) break; + if (bitmap_set_bit (bb_tmp, pre->index)) + vec.quick_push (pre); + + bool ok = true; + while (!vec.is_empty ()) + { + basic_block bb = vec.pop (); + bitmap_set_bit (bb_tmp, pre->index); + + if (!dominated_by_p (CDI_DOMINATORS, bb, pre)) + { + ok = false; + break; + } + + FOR_EACH_EDGE (e, ei, bb->succs) + if (!bitmap_bit_p (bb_with, e->dest->index) + && bitmap_set_bit (bb_tmp, e->dest->index)) + vec.quick_push (e->dest); + } + + if (ok) + last_ok = pre; + else + break; + pro = pre; - if (can_get_prologue (pro, prologue_clobbered)) - last_ok = pro; } + pro = last_ok; + BITMAP_FREE (bb_tmp); + vec.release (); free_dominance_info (CDI_POST_DOMINATORS); } -- 1.9.3