On Tue, Feb 22, 2022 at 8:19 PM Andrew MacLeod via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> On 2/22/22 13:07, Jeff Law wrote:
> >
> >
> > On 2/22/2022 10:57 AM, Jakub Jelinek via Gcc-patches wrote:
> >> On Tue, Feb 22, 2022 at 12:39:28PM -0500, Andrew MacLeod wrote:
> >>>> That is EH, then there are calls that might not return because they
> >>>> leave in some other way (e.g. longjmp), or might loop forever, might
> >>>> exit, might abort, trap etc.
> >>> Generally speaking, calls which do not return should not now be a
> >>> problem... as long as they do not transfer control to somewhere else
> >>> in the current function.
> >> I thought all of those cases are very relevant to PR104530.
> >> If we have:
> >>   _1 = ptr_2(D) == 0;
> >>   // unrelated code in the same bb
> >>   _3 = *ptr_2(D);
> >> then in light of PR104288, we can optimize ptr_2(D) == 0 into false
> >> only if there are no calls inside of "// unrelated code in the same
> >> bb", or if all calls in "// unrelated code in the same bb" are
> >> guaranteed to return exactly once.  Because if there is a call in
> >> there which could exit (that is the PR104288 testcase), or abort, or
> >> trap, or loop forever, or throw externally, or longjmp, or in any
> >> other non-UB way cause the _1 = ptr_2(D) == 0; stmt to be executed
> >> at runtime but _3 = *ptr_2(D); not to be executed, then we can't
> >> optimize the earlier comparison, because ptr_2(D) could be NULL in a
> >> valid program.  While if there are no calls (and no problematic
> >> inline asms) and no trapping insns in between, we can, and PR104530
> >> is asking that we continue to optimize that.
> > Right.  This is similar to some of the restrictions we deal with in
> > the path isolation pass.  Essentially we have a path which, when
> > traversed, would result in a *0.
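[Editorial note: the constraint Jakub describes can be seen in a minimal C reduction. This is a hypothetical sketch, not the actual PR104288/PR104530 testcase; the names `f`, `may_not_return`, and `sink` are made up for illustration. The point is that a program passing a NULL pointer is still valid when the intervening call never returns, so the earlier comparison must not be folded.]

```c
#include <stdlib.h>

int sink;

/* Stand-in for "unrelated code in the same bb": a call that may
   legitimately not return.  */
static void may_not_return(int do_exit)
{
    if (do_exit)
        exit(0);
}

/* The comparison executes before the call; the dereference executes
   only if the call returns.  Folding cmp to 0 would be wrong, because
   f(NULL, 1) is a valid execution that never reaches *p.  */
int f(int *p, int do_exit)
{
    int cmp = (p == 0);
    may_not_return(do_exit);
    sink = *p;
    return cmp;
}
```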
> > We would like to be able to find the edge upon which the *0 is
> > control dependent and optimize the test so that it always goes to the
> > valid path rather than the *0 path.
> >
> > The problem is there may be observable side effects on the *0 path
> > between the test and the actual *0 -- including calls to non-returning
> > functions, setjmp/longjmp, things that could trap, etc.  This case is
> > similar.  We can't back-propagate the non-null status through any
> > statements with observable side effects.
> >
> > Jeff
>
> We can't back-propagate, but we can alter our forward view.  Any
> ssa-name defined before the observable side effect can be recalculated
> using the updated values, and all uses of those names after the side
> effect will then appear to be "up-to-date".
>
> This does not actually change anything before the side-effect
> statement, but the lazy re-evaluation ranger employs makes it appear as
> if we do a new computation when _1 is used afterwards, ie:
>
>   _1 = ptr_2(D) == 0;
>   // unrelated code in the same bb
>   _3 = *ptr_2(D);
>   _4 = ptr_2(D) == 0;   // ptr_2 is known to be [+1, +INF] now.
>
> and we use _4 everywhere _1 was used.  This is the effect.
>
> So we do not actually change anything in the unrelated code, just the
> observable effects afterwards.  We already do these recalculations on
> outgoing edges in other blocks, just not within the definition block,
> because non-null wasn't visible within the def block.
>
> Additionally, in the testcase there is a store to c before the side
> effects.  These patches get rid of the branch, and thus the call, in
> the testcase as requested, but we still have to compute _3 in order to
> store it into the global c, since it occurs pre-side-effect:
>
>   b.0_1 = b;
>   _2 = b.0_1 == 0B;
>   _3 = (int) _2;
>   c = _3;
>   _5 = *b.0_1;
>
> No matter how you look at it, you are going to need to process a block
> twice in order to handle any code pre-side-effect, whether it be by
> assigning stmt uids or what have you.
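[Editorial note: a C-level sketch of what the quoted gimple could correspond to. This is a reconstruction, not the actual PR104530 source; the function name `g` is made up. The store to `c` happens before the dereference and so must still be computed, while any re-evaluation of the comparison after the dereference can be treated as constant false.]

```c
int *b;
int c;

int g(void)
{
    c = (b == 0);          /* _2/_3: pre-side-effect, must be computed
                              and stored with its original value  */
    int v = *b;            /* side effect: afterwards b is known
                              non-null, i.e. [+1, +INF]  */
    int again = (b == 0);  /* the recalculated _4: foldable to 0  */
    return v + again;
}
```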
Yes.  I thought that is what ranger already does when it discovers new
ranges from edges.  Say we have

  _1 = 10 / _2;
  if (_2 == 1)
    {
      _3 = _1 + 1;

then when evaluating _1 + 1 we re-evaluate 10 / _2 using _2 == 1 and
can compute _3 to [11, 11]?  That obviously extends to any stmt-level
ranges we discover for uses (not defs, because defs are never used
upthread).  And doing that is _not_ affected by any function/BB
terminating calls or EH or whatnot, as long as the updated ranges only
affect stmts dominating the current one.

What complicates all this reasoning is that it is straightforward when
you work with a traditional IL-walking pass, but it gets hard (and
possibly easy to get wrong) with on-demand processing and caching,
because everything you cache is now context-dependent (valid only
starting after stmt X and for stmts dominated by it).

> VRP could pre-process the block, and if it gets to the end of the
> block and it had at least one statement with a side effect and no
> calls which may not return, you could process the block with all the
> side effects already active.  I'm not sure if that buys as much as it
> costs, but it would change the value written to c to be 0, and it
> would change the global values exported for _2 and _3.
>
> Another option would be to flag the ssa-names instead of, or as well
> as, marking them as stale.  If we get to the end of the block and
> there were no non-returning functions or EH edges, then re-calculate
> and export those ssa-names using the latest values.  That would export
> [0,0] for _2 and _3.
>
> This would have no tangible impact during the first VRP pass, but the
> *next* VRP pass (or any other ranger pass) would pick up the new
> global ranges and do all the right things... so we basically let a
> subsequent pass pick up the info and do the dirty work.
>
> Andrew
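[Editorial note: the `_1 = 10 / _2` re-evaluation example quoted earlier in the thread corresponds to roughly this C shape; the function name `h` is made up for illustration.]

```c
/* Under the guard x == 1, ranger can re-evaluate 10 / x with the edge
   range [1, 1], so q is known to be exactly 10 on that path and q + 1
   gets the range [11, 11].  */
int h(int x)
{
    int q = 10 / x;
    if (x == 1)
        return q + 1;
    return q;
}
```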