On 05/16/2018 07:55 AM, Richard Biener wrote:
> On Wed, 16 May 2018, Richard Biener wrote:
> 
>> On Tue, 15 May 2018, Richard Biener wrote:
>>
>>> On Tue, 15 May 2018, Richard Biener wrote:
>>>
>>>> On Tue, 15 May 2018, Richard Biener wrote:
>>>>
>>>>> On Tue, 15 May 2018, Richard Biener wrote:
>>>>>
>>>>>> On Tue, 15 May 2018, Richard Biener wrote:
>>>>>>
>>>>>>> On Mon, 14 May 2018, Kugan Vivekanandarajah wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Attached patch handles PR63185 when we reach PHI with temp != NULLL.
>>>>>>>> We could see the PHI and if there isn't any uses for PHI that is
>>>>>>>> interesting, we could ignore that ?
>>>>>>>>
>>>>>>>> Bootstrapped and regression tested on x86_64-linux-gnu.
>>>>>>>> Is this OK?
>>>>>>>
>>>>>>> No, as Jeff said we can't do it this way.
>>>>>>>
>>>>>>> If we end up with multiple VDEFs in the walk of defvar immediate
>>>>>>> uses we know we are dealing with a CFG fork.  We can't really
>>>>>>> ignore any of the paths but we have to
>>>>>>>
>>>>>>>  a) find the merge point (and the associated VDEF)
>>>>>>>  b) verify for each each chain of VDEFs with associated VUSEs
>>>>>>>     up to that merge VDEF that we have no uses of the to classify
>>>>>>>     store and collect (partial) kills
>>>>>>>  c) intersect kill info and continue walking from the merge point
>>>>>>>
>>>>>>> in b) there's the optional possibility to find sinking opportunities
>>>>>>> in case we have kills on some paths but uses on others.  This is why
>>>>>>> DSE should be really merged with (store) sinking.
>>>>>>>
>>>>>>> So if we want to enhance DSEs handling of branches then we need
>>>>>>> to refactor the simple dse_classify_store function.  Let me take
>>>>>>> an attempt at this today.
>>>>>>
>>>>>> First (baby) step is the following - it arranges to collect the
>>>>>> defs we need to continue walking from and implements trivial
>>>>>> reduction by stopping at (full) kills.  This allows us to handle
>>>>>> the new testcase (which was already handled in the very late DSE
>>>>>> pass with the help of sinking the store).
>>>>>>
>>>>>> I took the opportunity to kill the use_stmt parameter of
>>>>>> dse_classify_store as the only user is only looking for whether
>>>>>> the kills were all clobbers which I added a new parameter for.
>>>>>>
>>>>>> I didn't adjust the byte-tracking case fully (I'm not fully understanding
>>>>>> the code in the case of a use and I'm not sure whether it's worth
>>>>>> doing the def reduction with byte-tracking).
>>>>>>
>>>>>> Your testcase can be handled by reducing the PHI and the call def
>>>>>> by seeing that the only use of a candidate def is another def
>>>>>> we have already processed.  Not sure if worth special-casing though,
>>>>>> I'd rather have a go at "recursing".  That will be the next
>>>>>> patch.
>>>>>>
>>>>>> Bootstrap & regtest running on x86_64-unknown-linux-gnu.
>>>>>
>>>>> Applied.
>>>>>
>>>>> Another intermediate one below, fixing the byte-tracking for
>>>>> stmt with uses.  This also re-does the PHI handling by simply
>>>>> avoiding recursion by means of a visited bitmap and stopping
>>>>> at the DSE classify stmt when re-visiting it instead of failing.
>>>>> This covers Pratamesh loop case for which I added ssa-dse-33.c.
>>>>> For the do-while loop this still runs into the inability to
>>>>> handle two defs to walk from.
>>>>>
>>>>> Bootstrap & regtest running on x86_64-unknown-linux-gnu.
>>>>
>>>> Ok, loop handling doesn't work in general since we run into the
>>>> issue that SSA form across the backedge is not representing the
>>>> same values.  Consider
>>>>
>>>>  <bb 3>
>>>>  # .MEM_22 = PHI <.MEM_12(D)(2), .MEM_13(4)>
>>>>  # n_20 = PHI <0(2), n_7(4)>
>>>>  # .MEM_13 = VDEF <.MEM_22>
>>>>  bytes[n_20] = _4;
>>>>  if (n_20 > 7)
>>>>    goto <bb 5>;
>>>>
>>>>  <bb 4>
>>>>  n_7 = n_20 + 1;
>>>>  # .MEM_15 = VDEF <.MEM_13>
>>>>  bytes[n_20] = _5;
>>>>  goto <bb 3>;
>>>>
>>>> then when classifying the store in bb4, visiting the PHI node
>>>> gets us to the store in bb3 which appears to be killing.
>>>>
>>>>        if (gimple_code (temp) == GIMPLE_PHI)
>>>> -       defvar = PHI_RESULT (temp);
>>>> +       {
>>>> +         /* If we visit this PHI by following a backedge then reset
>>>> +            any info in ref that may refer to SSA names which we'd need
>>>> +            to PHI translate.  */
>>>> +         if (gimple_bb (temp) == gimple_bb (stmt)
>>>> +             || dominated_by_p (CDI_DOMINATORS, gimple_bb (stmt),
>>>> +                                gimple_bb (temp)))
>>>> +           /* ???  ref->ref may not refer to SSA names or it may only
>>>> +              refer to SSA names that are invariant with respect to the
>>>> +              loop represented by this PHI node.  */
>>>> +           ref->ref = NULL_TREE;
>>>> +         defvar = PHI_RESULT (temp);
>>>> +         bitmap_set_bit (visited, SSA_NAME_VERSION (defvar));
>>>> +       }
>>>>
>>>> should be a workable solution for that.  I'm checking that, but
>>>> eventually you can think of other things that might prevent us from
>>>> handling backedges.  Note the current code tries to allow
>>>> looking across loops but not handle backedges of loops the
>>>> original stmt belongs to.
>>>
>>> Just to mention before I leave for the day I think I've identified
>>> a latent issue where I just fail to produce a testcase right now
>>> which is that non-return exits from a function are not reliably
>>> visited given they do not all have virtual operands, like
>>> for example resx.  Thus consider
>>>
>>> void foo (int *p, int x)
>>> {
>>>   *p = 0;
>>>   if (x)
>>>     resx;
>>>   *p = 1;
>>>   return;
>>> }
>>>
>>> where we will never visit any stmts on the resx path and thus think
>>> the *p = 0 store is dead (all with current trunk of course).
>>>
>>> Will continue to dig tomorrow.
>>
>> PR85803.
>>
>> I am currently re-testing the following on x86_64-unknown-linux-gnu
>> with --param dse-max-alias-queries-per-store=100000 in BOOT_CFLAGS.
>>
>> I'll refrain from doing the separate-walks-for-each-def thing
>> but will try to fix PR85757 by pattern matching the
>> def-has-single-virtual-use-that-is-another-def-in-the-vector case
>> in which case we can remove it from further processing.
> 
> After committing I am now testing that fix for PR85757
> on x86_64-unknown-linux-gnu.
> 
> Richard.
> 
> 
> 2018-05-16  Richard Biener  <rguent...@suse.de>
> 
>       PR tree-optimization/85757
>       * tree-ssa-dse.c (dse_classify_store): Record a PHI def and
>       remove defs that only feed that PHI from further processing.
> 
>       * gcc.dg/tree-ssa/ssa-dse-34.c: New testcase.
>
Seems reasonable.

Jeff

Reply via email to