On 05/16/2018 07:55 AM, Richard Biener wrote: > On Wed, 16 May 2018, Richard Biener wrote: > >> On Tue, 15 May 2018, Richard Biener wrote: >> >>> On Tue, 15 May 2018, Richard Biener wrote: >>> >>>> On Tue, 15 May 2018, Richard Biener wrote: >>>> >>>>> On Tue, 15 May 2018, Richard Biener wrote: >>>>> >>>>>> On Tue, 15 May 2018, Richard Biener wrote: >>>>>> >>>>>>> On Mon, 14 May 2018, Kugan Vivekanandarajah wrote: >>>>>>> >>>>>>>> Hi, >>>>>>>> >>>>>>>> Attached patch handles PR63185 when we reach PHI with temp != NULLL. >>>>>>>> We could see the PHI and if there isn't any uses for PHI that is >>>>>>>> interesting, we could ignore that ? >>>>>>>> >>>>>>>> Bootstrapped and regression tested on x86_64-linux-gnu. >>>>>>>> Is this OK? >>>>>>> >>>>>>> No, as Jeff said we can't do it this way. >>>>>>> >>>>>>> If we end up with multiple VDEFs in the walk of defvar immediate >>>>>>> uses we know we are dealing with a CFG fork. We can't really >>>>>>> ignore any of the paths but we have to >>>>>>> >>>>>>> a) find the merge point (and the associated VDEF) >>>>>>> b) verify for each each chain of VDEFs with associated VUSEs >>>>>>> up to that merge VDEF that we have no uses of the to classify >>>>>>> store and collect (partial) kills >>>>>>> c) intersect kill info and continue walking from the merge point >>>>>>> >>>>>>> in b) there's the optional possibility to find sinking opportunities >>>>>>> in case we have kills on some paths but uses on others. This is why >>>>>>> DSE should be really merged with (store) sinking. >>>>>>> >>>>>>> So if we want to enhance DSEs handling of branches then we need >>>>>>> to refactor the simple dse_classify_store function. Let me take >>>>>>> an attempt at this today. >>>>>> >>>>>> First (baby) step is the following - it arranges to collect the >>>>>> defs we need to continue walking from and implements trivial >>>>>> reduction by stopping at (full) kills. This allows us to handle >>>>>> the new testcase (which was already handled in the very late DSE >>>>>> pass with the help of sinking the store). >>>>>> >>>>>> I took the opportunity to kill the use_stmt parameter of >>>>>> dse_classify_store as the only user is only looking for whether >>>>>> the kills were all clobbers which I added a new parameter for. >>>>>> >>>>>> I didn't adjust the byte-tracking case fully (I'm not fully understanding >>>>>> the code in the case of a use and I'm not sure whether it's worth >>>>>> doing the def reduction with byte-tracking). >>>>>> >>>>>> Your testcase can be handled by reducing the PHI and the call def >>>>>> by seeing that the only use of a candidate def is another def >>>>>> we have already processed. Not sure if worth special-casing though, >>>>>> I'd rather have a go at "recursing". That will be the next >>>>>> patch. >>>>>> >>>>>> Bootstrap & regtest running on x86_64-unknown-linux-gnu. >>>>> >>>>> Applied. >>>>> >>>>> Another intermediate one below, fixing the byte-tracking for >>>>> stmt with uses. This also re-does the PHI handling by simply >>>>> avoiding recursion by means of a visited bitmap and stopping >>>>> at the DSE classify stmt when re-visiting it instead of failing. >>>>> This covers Pratamesh loop case for which I added ssa-dse-33.c. >>>>> For the do-while loop this still runs into the inability to >>>>> handle two defs to walk from. >>>>> >>>>> Bootstrap & regtest running on x86_64-unknown-linux-gnu. >>>> >>>> Ok, loop handling doesn't work in general since we run into the >>>> issue that SSA form across the backedge is not representing the >>>> same values. Consider >>>> >>>> <bb 3> >>>> # .MEM_22 = PHI <.MEM_12(D)(2), .MEM_13(4)> >>>> # n_20 = PHI <0(2), n_7(4)> >>>> # .MEM_13 = VDEF <.MEM_22> >>>> bytes[n_20] = _4; >>>> if (n_20 > 7) >>>> goto <bb 5>; >>>> >>>> <bb 4> >>>> n_7 = n_20 + 1; >>>> # .MEM_15 = VDEF <.MEM_13> >>>> bytes[n_20] = _5; >>>> goto <bb 3>; >>>> >>>> then when classifying the store in bb4, visiting the PHI node >>>> gets us to the store in bb3 which appears to be killing. >>>> >>>> if (gimple_code (temp) == GIMPLE_PHI) >>>> - defvar = PHI_RESULT (temp); >>>> + { >>>> + /* If we visit this PHI by following a backedge then reset >>>> + any info in ref that may refer to SSA names which we'd need >>>> + to PHI translate. */ >>>> + if (gimple_bb (temp) == gimple_bb (stmt) >>>> + || dominated_by_p (CDI_DOMINATORS, gimple_bb (stmt), >>>> + gimple_bb (temp))) >>>> + /* ??? ref->ref may not refer to SSA names or it may only >>>> + refer to SSA names that are invariant with respect to the >>>> + loop represented by this PHI node. */ >>>> + ref->ref = NULL_TREE; >>>> + defvar = PHI_RESULT (temp); >>>> + bitmap_set_bit (visited, SSA_NAME_VERSION (defvar)); >>>> + } >>>> >>>> should be a workable solution for that. I'm checking that, but >>>> eventually you can think of other things that might prevent us from >>>> handling backedges. Note the current code tries to allow >>>> looking across loops but not handle backedges of loops the >>>> original stmt belongs to. >>> >>> Just to mention before I leave for the day I think I've identified >>> a latent issue where I just fail to produce a testcase right now >>> which is that non-return exits from a function are not reliably >>> visited given they do not all have virtual operands, like >>> for example resx. Thus consider >>> >>> void foo (int *p, int x) >>> { >>> *p = 0; >>> if (x) >>> resx; >>> *p = 1; >>> return; >>> } >>> >>> where we will never visit any stmts on the resx path and thus think >>> the *p = 0 store is dead (all with current trunk of course). >>> >>> Will continue to dig tomorrow. >> >> PR85803. >> >> I am currently re-testing the following on x86_64-unknown-linux-gnu >> with --param dse-max-alias-queries-per-store=100000 in BOOT_CFLAGS. >> >> I'll refrain from doing the separate-walks-for-each-def thing >> but will try to fix PR85757 by pattern matching the >> def-has-single-virtual-use-that-is-another-def-in-the-vector case >> in which case we can remove it from further processing. > > After committing I am now testing that fix for PR85757 > on x86_64-unknown-linux-gnu. > > Richard. > > > 2018-05-16 Richard Biener <rguent...@suse.de> > > PR tree-optimization/85757 > * tree-ssa-dse.c (dse_classify_store): Record a PHI def and > remove defs that only feed that PHI from further processing. > > * gcc.dg/tree-ssa/ssa-dse-34.c: New testcase. > Seems reasonable.
Jeff