On 07/03/2018 03:31 AM, Aldy Hernandez wrote: > On 07/02/2018 07:08 AM, Christophe Lyon wrote: > >>>> On 11/07/2017 10:33 AM, Aldy Hernandez wrote: >>>>> While poking around in the backwards threader I noticed that we bail if >>>>> >>>>> we have already seen a starting BB. >>>>> >>>>> /* Do not jump-thread twice from the same block. */ >>>>> if (bitmap_bit_p (threaded_blocks, entry->src->index) >>>>> >>>>> This limitation discards paths that are sub-paths of paths that have >>>>> already been threaded. >>>>> >>>>> The following patch scans the remaining to-be-threaded paths to identify >>>>> >>>>> if any of them start from the same point, and are thus sub-paths of the >>>>> >>>>> just-threaded path. By removing the common prefix of blocks in upcoming >>>>> >>>>> threadable paths, and then rewiring first non-common block >>>>> appropriately, we expose new threading opportunities, since we are no >>>>> longer starting from the same BB. We also simplify the would-be >>>>> threaded paths, because we don't duplicate already duplicated paths. > [snip] >> Hi, >> >> I've noticed a regression on aarch64: >> FAIL: gcc.dg/tree-ssa/ssa-dom-thread-7.c scan-tree-dump thread3 "Jumps >> threaded: 3" >> very likely caused by this patch (appeared between 262282 and 262294) >> >> Christophe > > The test needs to be adjusted here. > > The long story is that the aarch64 IL is different at thread3 time in > that it has 2 profitable sub-paths that can now be threaded with my > patch. This is causing the threaded count to be 5 for aarch64, versus 3 > for x86 64. Previously we couldn't thread these in aarch64, so the > backwards threader would bail. > > One can see the different threading opportunities by sticking > debug_all_paths() at the top of thread_through_all_blocks(). You will > notice that aarch64 has far more candidates to begin with. The IL on > the x86 backend, has no paths that start on the same BB. The aarch64, > on the other hand, has many to choose from: > > path: 52 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 11, > path: 51 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 16, > path: 53 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 16, > path: 52 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 11, 11 -> 35, > path: 51 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 11, 11 -> 35, > path: 53 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 11, 11 -> 35, > path: 52 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 16, 16 -> 17, > path: 51 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 16, 16 -> 17, > path: 53 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 16, 16 -> 19, > > Some of these prove unprofitable, but 2 more than before are profitable now. > > > BTW, I see another threading related failure on aarch64 which is > unrelated to my patch, and was previously there: > > FAIL: gcc.dg/tree-ssa/ssa-dom-thread-7.c scan-tree-dump-not vrp2 "Jumps > threaded" > > This is probably another IL incompatibility between architectures. > > Anyways... the attached path fixes the regression. I have added a note > to the test explaining the IL differences. We really should rewrite all > the threading tests (I am NOT volunteering ;-)). > > OK for trunk? > Aldy > > curr.patch > > > gcc/testsuite/ > > * gcc.dg/tree-ssa/ssa-dom-thread-7.c: Adjust test because aarch64 > has a slightly different IL that provides more threading > opportunities. OK.
WRT rewriting the tests. I'd certainly agree that we don't have the right set of knobs to allow us to characterize the target nor do we have the right dumping/scanning facilities to describe and query the CFG changes. The fact that the IL changes so much across targets is a sign that target dependency (probably BRANCH_COST) is twiddling the gimple we generate. I strongly suspect we'd be a lot better off if we tackled the BRANCH_COST problem first. jeff ps. That particular test is the test which led to the creation of the backwards jump threader :-)