On 07/03/2018 03:31 AM, Aldy Hernandez wrote:
> On 07/02/2018 07:08 AM, Christophe Lyon wrote:
> 
>>>> On 11/07/2017 10:33 AM, Aldy Hernandez wrote:
>>>>> While poking around in the backwards threader I noticed that we bail if
>>>>>
>>>>> we have already seen a starting BB.
>>>>>
>>>>>         /* Do not jump-thread twice from the same block.  */
>>>>>         if (bitmap_bit_p (threaded_blocks, entry->src->index)
>>>>>
>>>>> This limitation discards paths that are sub-paths of paths that have
>>>>> already been threaded.
>>>>>
>>>>> The following patch scans the remaining to-be-threaded paths to identify
>>>>>
>>>>> if any of them start from the same point, and are thus sub-paths of the
>>>>>
>>>>> just-threaded path.  By removing the common prefix of blocks in upcoming
>>>>>
>>>>> threadable paths, and then rewiring first non-common block
>>>>> appropriately, we expose new threading opportunities, since we are no
>>>>> longer starting from the same BB.  We also simplify the would-be
>>>>> threaded paths, because we don't duplicate already duplicated paths.
> [snip]
>> Hi,
>>
>> I've noticed a regression on aarch64:
>> FAIL: gcc.dg/tree-ssa/ssa-dom-thread-7.c scan-tree-dump thread3 "Jumps
>> threaded: 3"
>> very likely caused by this patch (appeared between 262282 and 262294)
>>
>> Christophe
> 
> The test needs to be adjusted here.
> 
> The long story is that the aarch64 IL is different at thread3 time in
> that it has 2 profitable sub-paths that can now be threaded with my
> patch.  This is causing the threaded count to be 5 for aarch64, versus 3
> for x86 64.  Previously we couldn't thread these in aarch64, so the
> backwards threader would bail.
> 
> One can see the different threading opportunities by sticking
> debug_all_paths() at the top of thread_through_all_blocks().  You will
> notice that aarch64 has far more candidates to begin with.  The IL on
> the x86 backend, has no paths that start on the same BB.  The aarch64,
> on the other hand, has many to choose from:
> 
> path: 52 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 11,
> path: 51 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 16,
> path: 53 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 16,
> path: 52 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 11, 11 -> 35,
> path: 51 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 11, 11 -> 35,
> path: 53 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 11, 11 -> 35,
> path: 52 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 16, 16 -> 17,
> path: 51 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 16, 16 -> 17,
> path: 53 -> 56, 56 -> 57, 57 -> 58, 58 -> 10, 10 -> 16, 16 -> 19,
> 
> Some of these prove unprofitable, but 2 more than before are profitable now.
> 
> 
> BTW, I see another threading related failure on aarch64 which is
> unrelated to my patch, and was previously there:
> 
> FAIL: gcc.dg/tree-ssa/ssa-dom-thread-7.c scan-tree-dump-not vrp2 "Jumps
> threaded"
> 
> This is probably another IL incompatibility between architectures.
> 
> Anyways... the attached path fixes the regression.  I have added a note
> to the test explaining the IL differences.  We really should rewrite all
> the threading tests (I am NOT volunteering ;-)).
> 
> OK for trunk?
> Aldy
> 
> curr.patch
> 
> 
> gcc/testsuite/
> 
>       * gcc.dg/tree-ssa/ssa-dom-thread-7.c: Adjust test because aarch64
>       has a slightly different IL that provides more threading
>       opportunities.
OK.

WRT rewriting the tests.  I'd certainly agree that we don't have the
right set of knobs to allow us to characterize the target nor do we have
the right dumping/scanning facilities to describe and query the CFG changes.

The fact that the IL changes so much across targets is a sign that
target dependency (probably BRANCH_COST) is twiddling the gimple we
generate.  I strongly suspect we'd be a lot better off if we tackled the
BRANCH_COST problem first.

jeff

ps.  That particular test is the  test which led to the creation of the
backwards jump threader :-)

Reply via email to