This was yet another problem issue with threading through a loop backedge and finding equivalences that should have been invalidated.
In this instance, we were trying to thread through a large block. When we hit the statement threshold, thread_through_normal_block returned and thus statements later in the block which would have invalidated equivalences that we no longer valid after following a backedge were never examined.
So those equivalences were still in the tables. We then proceeded to try and use the block as a joiner and thread through one or more of its successors.
That's, of course, stupid. If we determined that the block was too big for normal threading, we certainly don't want to thread through it as a joiner either since that duplicates the block as well. So it's both a codesize and correctness issue.
This patch changes thread_through_normal_block to signal to its caller that the block was not fully processed due to code growth considerations. When that happens, we avoid trying to thread through the block's successors. That fixes both the code growth problem as well as the correctness issue.
Bootstrapped and regression tested on x86_64-unknown-linux-gnu. Installed on the trunk. Will backport this and the other jump threading fix to the 4.9 branch shortly.
diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 8e8b76e..0b27fc8 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,13 @@ +2014-05-08 Jeff Law <l...@redhat.com> + + PR tree-optimization/61009 + * tree-ssa-threadedge.c (thread_through_normal_block): Return a + tri-state rather than a boolean. When a block is too big to + thread through, inform caller via negative return value. + (thread_across_edge): If a block was too big for normal threading, + then it's too big for a joiner too, so remove temporary equivalences + and return immediately. + 2014-05-08 Manuel López-Ibáñez <m...@gcc.gnu.org> Matthias Klose <d...@ubuntu.com> diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog index 2dcf9dc..959763f 100644 --- a/gcc/testsuite/ChangeLog +++ b/gcc/testsuite/ChangeLog @@ -1,3 +1,8 @@ +2014-05-08 Jeff Law <l...@redhat.com> + + PR tree-optimization/61009 + * g++.dg/tree-ssa/pr61009.C: New test. + 2014-05-08 Matthias Klose <d...@ubuntu.com> PR driver/61106 diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr61009.C b/gcc/testsuite/g++.dg/tree-ssa/pr61009.C new file mode 100644 index 0000000..4e7bb1a --- /dev/null +++ b/gcc/testsuite/g++.dg/tree-ssa/pr61009.C @@ -0,0 +1,53 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fno-tree-vrp -std=c++11 -fno-strict-aliasing -fdump-tree-dom1" } */ + +#include <stdio.h> +struct Field { + virtual int Compare(void*, void*); +}; +extern int NKF, NR; +extern int idxs[]; +extern Field* the_field; +extern int *incs; +extern char** fptrs; +inline int doCmp(int this_row_offset, int field_idx) { + void *p = fptrs[field_idx] + this_row_offset * incs[field_idx]; + return the_field->Compare(p,0); +} +bool Test(void) { + + int row_offset = 0; + + for (; row_offset < NR; ++row_offset) { + + bool is_different = false; + for (int j = 0; j < NKF ; ++j) { + int field_idx = idxs[j]; + int cmp = doCmp(row_offset, field_idx); + fprintf (stderr, "cmp=%d\n",cmp); + + if (cmp == 0) { + continue; + } + if (cmp > 0) { + is_different = true; + break; + } else { + fprintf (stderr, "Incorrect\n"); + return false; + } + } + if (!is_different) { + + return false; + } + } + + return true; +} + +// The block ending with cmp == 0 should not be threaded. ie, +// there should be a single == 0 comparison in the dump file. + +// { dg-final { scan-tree-dump-times "== 0" 1 "dom1" } } +// { dg-final { cleanup-tree-dump "dom1" } } diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c index 7621348..8e628d5 100644 --- a/gcc/tree-ssa-threadedge.c +++ b/gcc/tree-ssa-threadedge.c @@ -966,9 +966,14 @@ thread_around_empty_blocks (edge taken_edge, SIMPLIFY is a pass-specific function used to simplify statements. Our caller is responsible for restoring the state of the expression - and const_and_copies stacks. */ + and const_and_copies stacks. -static bool + Positive return value is success. Zero return value is failure, but + the block can still be duplicated as a joiner in a jump thread path, + negative indicates the block should not be duplicated and thus is not + suitable for a joiner in a jump threading path. */ + +static int thread_through_normal_block (edge e, gimple dummy_cond, bool handle_dominating_asserts, @@ -990,7 +995,7 @@ thread_through_normal_block (edge e, /* PHIs create temporary equivalences. */ if (!record_temporary_equivalences_from_phis (e, stack, *backedge_seen_p, src_map, dst_map)) - return false; + return 0; /* Now walk each statement recording any context sensitive temporary equivalences we can detect. */ @@ -998,8 +1003,16 @@ thread_through_normal_block (edge e, = record_temporary_equivalences_from_stmts_at_dest (e, stack, simplify, *backedge_seen_p, src_map, dst_map); + + /* If we didn't look at all the statements, the most likely reason is + there were too many and thus duplicating this block is not profitable. + + Also note if we do not look at all the statements, then we may not + have invalidated equivalences that are no longer valid if we threaded + around a loop. Thus we must signal to our caller that this block + is not suitable for use as a joiner in a threading path. */ if (!stmt) - return false; + return -1; /* If we stopped at a COND_EXPR or SWITCH_EXPR, see if we know which arm will be taken. */ @@ -1023,7 +1036,7 @@ thread_through_normal_block (edge e, if (dest == NULL || dest == e->dest || bitmap_bit_p (visited, dest->index)) - return false; + return 0; /* Only push the EDGE_START_JUMP_THREAD marker if this is first edge on the path. */ @@ -1057,10 +1070,10 @@ thread_through_normal_block (edge e, visited, path, backedge_seen_p); - return true; + return 1; } } - return false; + return 0; } /* We are exiting E->src, see if E->dest ends with a conditional @@ -1112,9 +1125,12 @@ thread_across_edge (gimple dummy_cond, if (backedge_seen) simplify = dummy_simplify; - if (thread_through_normal_block (e, dummy_cond, handle_dominating_asserts, - stack, simplify, path, visited, - &backedge_seen, src_map, dst_map)) + int threaded = thread_through_normal_block (e, dummy_cond, + handle_dominating_asserts, + stack, simplify, path, + visited, &backedge_seen, + src_map, dst_map); + if (threaded > 0) { propagate_threaded_block_debug_into (path->last ()->e->dest, e->dest); @@ -1127,10 +1143,27 @@ thread_across_edge (gimple dummy_cond, } else { - /* There should be no edges on the path, so no need to walk through - the vector entries. */ + /* Negative and zero return values indicate no threading was possible, + thus there should be no edges on the thread path and no need to walk + through the vector entries. */ gcc_assert (path->length () == 0); path->release (); + + /* A negative status indicates the target block was deemed too big to + duplicate. Just quit now rather than trying to use the block as + a joiner in a jump threading path. + + This prevents unnecessary code growth, but more importantly if we + do not look at all the statements in the block, then we may have + missed some invalidations if we had traversed a backedge! */ + if (threaded < 0) + { + BITMAP_FREE (visited); + BITMAP_FREE (src_map); + BITMAP_FREE (dst_map); + remove_temporary_equivalences (stack); + return; + } } /* We were unable to determine what out edge from E->dest is taken. However, @@ -1212,7 +1245,7 @@ thread_across_edge (gimple dummy_cond, handle_dominating_asserts, stack, simplify, path, visited, &backedge_seen, - src_map, dst_map); + src_map, dst_map) > 0; /* If we were able to thread through a successor of E->dest, then record the jump threading opportunity. */