[ Returning to another old patch... ]
On 11/07/2017 10:33 AM, Aldy Hernandez wrote:
> [One more time, but without rejected HTML mail, because apparently this
> is my first post to gcc-patches *ever* ;-)].
>
> Howdy!
>
> While poking around in the backwards threader I noticed that we bail if
> we have already seen a starting BB.
>
> /* Do not jump-thread twice from the same block. */
> if (bitmap_bit_p (threaded_blocks, entry->src->index)
>
> This limitation discards paths that are sub-paths of paths that have
> already been threaded.
>
> The following patch scans the remaining to-be-threaded paths to identify
> if any of them start from the same point, and are thus sub-paths of the
> just-threaded path. By removing the common prefix of blocks in upcoming
> threadable paths, and then rewiring first non-common block
> appropriately, we expose new threading opportunities, since we are no
> longer starting from the same BB. We also simplify the would-be
> threaded paths, because we don't duplicate already duplicated paths.
>
> This sounds confusing, but the documentation for the entry point to my
> patch (adjust_paths_after_duplication) shows an actual example:
>
> +/* After an FSM path has been jump threaded, adjust the remaining FSM
> + paths that are subsets of this path, so these paths can be safely
> + threaded within the context of the new threaded path.
> +
> + For example, suppose we have just threaded:
> +
> + 5 -> 6 -> 7 -> 8 -> 12 => 5 -> 6' -> 7' -> 8' -> 12'
> +
> + And we have an upcoming threading candidate:
> + 5 -> 6 -> 7 -> 8 -> 15 -> 20
> +
> + This function adjusts the upcoming path into:
> + 8' -> 15 -> 20
>
> Notice that we will no longer have two paths that start from the same
> BB. One will start with bb5, while the adjusted path will start with
> bb8'. With this we kill two birds-- we are able to thread more paths,
> and these paths will avoid duplicating a whole mess of things that have
> already been threaded.
>
> The attached patch is a subset of some upcoming work that can live on
> its own. It bootstraps and regtests. Also, by running it on a handful
> of .ii files, I can see that we are able to thread sub-paths that we
> previously dropped on the floor. More is better, right? :)
>
> To test this, I stole Jeff's method of using cachegrind to benchmark
> instruction counts and conditional branches
> (https://gcc.gnu.org/ml/gcc-patches/2013-11/msg02434.html).
>
> Basically, I bootstrapped two compilers, with and without improvements,
> and used each to build a stage1 trunk. Each of these stage1-trunks were
> used on 246 .ii GCC files I have lying around from a bootstrap some
> random point last year. I used no special flags on builds apart from
> --enable-languages=c,c++.
>
> Although I would've wished a larger improvement, this works comes for
> free, as it's just a subset of other work I'm hacking on.
>
> Without further ado, here are my monumental, earth shattering improvements:
>
> Conditional branches
> Without patch: 411846839709
> With patch: 411831330316
> %changed: -0.0037660%
>
> Number of instructions
> Without patch: 2271844650634
> With patch: 2271761153578
> %changed: -0.0036754%
>
>
> OK for trunk?
> Aldy
>
> p.s. There is a lot of debugging/dumping code in my patch, which I can
> gladly remove if/when approved. It helps keep my head straight while
> looking at this spaghetti :).
>
> curr.patch
>
>
> gcc/
>
> * tree-ssa-threadupdate.c (mark_threaded_blocks): Avoid
> dereferencing path[] beyond its length.
> (debug_path): New.
> (debug_all_paths): New.
> (rewire_first_differing_edge): New.
> (adjust_paths_after_duplication): New.
> (duplicate_thread_path): Call adjust_paths_after_duplication.
> Add new argument.
> (thread_through_all_blocks): Add new argument to
> duplicate_thread_path.
This is fine for the trunk. I'd keep the dumping code as-is. It'll be
useful in the future :-)
>
> diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c
> index 1dab0f1fab4..53ac7181b4b 100644
> --- a/gcc/tree-ssa-threadupdate.c
> +++ b/gcc/tree-ssa-threadupdate.c
> +
> +/* Rewire a jump_thread_edge so that the source block is now a
> + threaded source block.
> +
> + PATH_NUM is an index into the global path table PATHS.
> + EDGE_NUM is the jump thread edge number into said path.
> +
> + Returns TRUE if we were able to successfully rewire the edge. */
> +
> +static bool
> +rewire_first_differing_edge (unsigned path_num, unsigned edge_num)
> +{
> + vec<jump_thread_edge *> *path = paths[path_num];
> + edge &e = (*path)[edge_num]->e;
> + if (dump_file && (dump_flags & TDF_DETAILS))
> + fprintf (dump_file, "rewiring edge candidate: %d -> %d\n",
> + e->src->index, e->dest->index);
> + basic_block src_copy = get_bb_copy (e->src);
> + if (src_copy == NULL)
> + {
> + if (dump_file && (dump_flags & TDF_DETAILS))
> + fprintf (dump_file, "ignoring candidate: there is no src COPY\n");
> + return false;
> + }
> + edge new_edge = find_edge (src_copy, e->dest);
> + /* If the previously threaded paths created a flow graph where we
> + can no longer figure out where to go, give up. */
> + if (new_edge == NULL)
> + {
> + if (dump_file && (dump_flags & TDF_DETAILS))
> + fprintf (dump_file, "ignoring candidate: we lost our way\n");
> + return false;
> + }
> + e = new_edge;
> + return true;
I was at first a bit confused about how this function had any visible
side effects. Then I realized that "e" is actually a reference to an
edge in the jump threading path and that seemingly dead assignment at
the end isn't really dead :-)
Jeff