On 05/23/2018 03:20 AM, Prathamesh Kulkarni wrote:
> On 23 May 2018 at 13:58, Richard Biener <rguent...@suse.de> wrote:
>> On Wed, 23 May 2018, Prathamesh Kulkarni wrote:
>>
>>> Hi,
>>> I am trying to work on PR80155, which exposes a problem with code
>>> hoisting and register pressure on a leading embedded benchmark for ARM
>>> cortex-m7, where code-hoisting causes an extra register spill.
>>>
>>> I have attached two test-cases which (hopefully) are representative of
>>> the original test-case.
>>> The first one (trans_dfa.c) is bigger and somewhat similar to the
>>> original test-case, and trans_dfa_2.c is a hand-reduced version of
>>> trans_dfa.c. There are 2 spills with trans_dfa.c
>>> and one spill with trans_dfa_2.c due to the smaller number of cases.
>>> The test-cases in the PR are probably not relevant.
>>>
>>> Initially I thought the spill was happening because of "too many
>>> hoistings" taking place in the original test-case, thus increasing the
>>> register pressure, but it seems the spill is possibly caused because
>>> an expression gets hoisted out of a block that is on a loop exit.
>>>
>>> For example, the following hoistings take place with trans_dfa_2.c:
>>>
>>> (1) Inserting expression in block 4 for code hoisting:
>>> {mem_ref<0B>,tab_20(D)}@.MEM_45 (0005)
>>>
>>> (2) Inserting expression in block 4 for code hoisting:
>>> {plus_expr,_4,1} (0006)
>>>
>>> (3) Inserting expression in block 4 for code hoisting:
>>> {pointer_plus_expr,s_33,1} (0023)
>>>
>>> (4) Inserting expression in block 3 for code hoisting:
>>> {pointer_plus_expr,s_33,1} (0023)
>>>
>>> The issue seems to be the hoisting of (*tab + 1), which consists of the
>>> first two hoistings in block 4 from blocks 5 and 9 and causes the
>>> extra spill. I verified that by disabling hoisting into block 4,
>>> which resulted in no extra spills.
>>>
>>> I wonder if that's because the expression (*tab + 1) is getting
>>> hoisted from blocks 5 and 9, which are on a loop exit? So the
>>> expression that was previously computed in a block on a loop exit
>>> gets hoisted outside that block, which possibly makes the allocator
>>> more defensive? Similarly, disabling hoisting of expressions which
>>> appeared in blocks on a loop exit in the original test-case prevented
>>> the extra spill. The other hoistings didn't seem to matter.
>>
>> I think that's simply coincidence.  The only thing that makes
>> a block that also exits from the loop special is that an
>> expression could be sunk out of the loop and hoisting (commoning
>> with another path) could prevent that.  But that isn't what is
>> happening here and it would be a pass ordering issue as
>> the sinking pass runs only after hoisting (no idea why exactly
>> but I guess there are cases where we want to prefer CSE over
>> sinking).  So you could try whether re-ordering PRE and sinking helps
>> your testcase.
> Thanks for the suggestions. Placing the sink pass before PRE works
> for both these test-cases! Sadly it still causes the spill for the
> benchmark -:(
> I will try to create a better approximation of the original test-case.
>>
>> What I do see is a missed opportunity to merge the successors
>> of BB 4.  After PRE we have
>>
>> <bb 4> [local count: 159303558]:
>> <L1>:
>> pretmp_123 = *tab_37(D);
>> _87 = pretmp_123 + 1;
>> if (c_36 == 65)
>>   goto <bb 5>; [34.00%]
>> else
>>   goto <bb 8>; [66.00%]
>>
>> <bb 5> [local count: 54163210]:
>> *tab_37(D) = _87;
>> _96 = MEM[(char *)s_57 + 1B];
>> if (_96 != 0)
>>   goto <bb 7>; [89.00%]
>> else
>>   goto <bb 6>; [11.00%]
>>
>> <bb 8> [local count: 105140348]:
>> *tab_37(D) = _87;
>> _56 = MEM[(char *)s_57 + 1B];
>> if (_56 != 0)
>>   goto <bb 10>; [89.00%]
>> else
>>   goto <bb 9>; [11.00%]
>>
>> here at least the stores and loads can be hoisted.  Note this
>> may also point at the real issue of the code hoisting which is
>> tearing apart the RMW operation?
> Indeed, this possibility seems much more likely than the block being on
> a loop exit. I will try to "hardcode" the load/store hoists into block 4
> for this specific test-case to check if that prevents the spill.
Even if it prevents the spill in this case, it's likely a good thing to
do.  The statements prior to the conditional in bb5 and bb8 should be
hoisted, leaving bb5 and bb8 with just their conditionals.
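
As a rough source-level illustration of that transformation (a sketch only:
the function names, return values and the check_agree helper are hypothetical
and are not taken from trans_dfa.c or from GCC), step_before mirrors the
shape of bb4/bb5/bb8 after PRE, with the same store and load duplicated in
both successors, and step_after shows the store and load hoisted so that the
read-modify-write of *tab stays together and only the conditionals remain:

```c
#include <assert.h>

/* Shape after PRE: the load and add sit in "bb 4", while the store to
   *tab and the load of s[1] are duplicated in both successors. */
static int step_before(int *tab, const char *s, int c)
{
    int tmp = *tab + 1;              /* pretmp load and add in "bb 4" */
    if (c == 65) {
        *tab = tmp;                  /* store duplicated in "bb 5" */
        char next = s[1];            /* load duplicated in "bb 5" */
        return next != 0 ? 1 : 2;
    } else {
        *tab = tmp;                  /* store duplicated in "bb 8" */
        char next = s[1];            /* load duplicated in "bb 8" */
        return next != 0 ? 3 : 4;
    }
}

/* Shape after hoisting the common statements into the predecessor:
   the read-modify-write is contiguous and each arm is just a test. */
static int step_after(int *tab, const char *s, int c)
{
    *tab = *tab + 1;                 /* RMW kept together */
    char next = s[1];                /* single load */
    if (c == 65)
        return next != 0 ? 1 : 2;
    else
        return next != 0 ? 3 : 4;
}

/* Hypothetical helper: both shapes must return the same value and
   leave the same value in *tab for a given input. */
static void check_agree(int c, const char *s)
{
    int t1 = 5, t2 = 5;
    int r1 = step_before(&t1, s, c);
    int r2 = step_after(&t2, s, c);
    assert(r1 == r2 && t1 == t2);
}
```

The two functions are meant to be observably equivalent; whether the hoisted
form actually avoids the spill on cortex-m7 is something only the experiment
above can show.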

Jeff
