On 2/22/22 13:07, Jeff Law wrote:
On 2/22/2022 10:57 AM, Jakub Jelinek via Gcc-patches wrote:
On Tue, Feb 22, 2022 at 12:39:28PM -0500, Andrew MacLeod wrote:
That is EH; then there are calls that might not return because they leave in some other way (e.g. longjmp), or might loop forever, might exit, might abort, trap, etc.
Generally speaking, calls which do not return should not be a problem now... as long as they do not transfer control to somewhere else in the current function.
I thought all of those cases are very relevant to PR104530.
If we have:
_1 = ptr_2(D) == 0;
// unrelated code in the same bb
_3 = *ptr_2(D);
then in light of PR104288, we can optimize ptr_2(D) == 0 into false only if there are no calls inside of "// unrelated code in the same bb", or if all calls in "// unrelated code in the same bb" are guaranteed to return exactly once. Because, if there is a call in there which could exit (that is the PR104288 testcase), or abort, or trap, or loop forever, or throw externally, or longjmp, or in any other non-UB way cause the _1 = ptr_2(D) == 0; stmt to be executed at runtime but _3 = *ptr_2(D) not to be executed, then we can't optimize the earlier comparison, because ptr_2(D) could be NULL in a valid program.
While if there are no calls (and no problematic inline asms) and no trapping insns in between, we can, and PR104530 is asking that we continue to optimize that.
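For illustration, a minimal C sketch of that hazard (the names here are invented for this example; this is not the actual PR104288 testcase):

extern int seen_null;
extern void may_exit (void);   /* might call exit (), longjmp (), or loop forever */

int
foo (int *ptr)
{
  seen_null = (ptr == 0);   /* _1 = ptr_2(D) == 0; observable before the call */
  may_exit ();              /* unrelated code in the same bb                   */
  return *ptr;              /* _3 = *ptr_2(D);                                 */
}

If foo (0) is called and may_exit () never returns, the store of 1 into seen_null is still required, so folding ptr == 0 to false here would be wrong; with no such call in between, the dereference does justify the fold, which is what PR104530 asks to keep.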
Right. This is similar to some of the restrictions we deal with in the path isolation pass. Essentially we have a path that, when traversed, would result in a *0. We would like to be able to find the edge upon which the *0 is control dependent and optimize the test so that it always goes to the valid path rather than the *0 path.
The problem is there may be observable side effects on the *0 path between the test and the actual *0 -- including calls to nonreturning functions, setjmp/longjmp, things that could trap, etc. This case is similar. We can't back-propagate the non-null status through any statements with observable side effects.
Jeff
We can't back-propagate, but we can alter our forward view. Any ssa-name defined before the observable side effect can be recalculated using the updated values, and all uses of those names after the side effect would then appear to be "up-to-date".
This does not actually change anything before the side-effect statement, but the lazy re-evaluation ranger employs makes it appear as if we do a new computation when _1 is used afterwards, i.e.:
_1 = ptr_2(D) == 0;
// unrelated code in the same bb
_3 = *ptr_2(D);
_4 = ptr_2(D) == 0; // ptr_2 is known to be [+1, +INF] now.
and we use _4 everywhere _1 was used; that is the effect.
So we do not actually change anything in the unrelated code, only what is observable afterwards. We already do these recalculations on outgoing edges in other blocks, just not within the definition block, because non-null wasn't visible within the def block.
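As a hypothetical C illustration of that effect (the function and variable names are invented for this example, not taken from the PRs):

int
baz (int *p)
{
  int was_null = (p == 0);   /* _1 = p_2(D) == 0;                        */
                             /* unrelated code could appear here         */
  int v = *p;                /* _3 = *p_2(D); p is non-null from here on */
  if (was_null)              /* a use after the dereference behaves as   */
    return -1;               /* if recomputed, so this branch folds away */
  return v;
}

Nothing before the dereference is touched, but any use of was_null after it sees the recomputed range and can be folded.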
Additionally, in the testcase there is a store to C before the side effects.
These patches get rid of the branch, and thus the call, in the testcase as requested, but we still have to compute _3 in order to store it into the global C, since that store occurs pre-side-effect:
b.0_1 = b;
_2 = b.0_1 == 0B;
_3 = (int) _2;
c = _3;
_5 = *b.0_1;
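For reference, a C function of roughly this shape would produce the GIMPLE above (just a guess at the shape for illustration, not the actual PR104530 testcase, and with the branch and call that the patches remove already gone):

int *b;
int c;

int
foo (void)
{
  c = (b == 0);   /* _2 = b.0_1 == 0B; _3 = (int) _2; c = _3; */
  return *b;      /* _5 = *b.0_1;                             */
}

The dereference makes b non-null, but the value stored into C was computed before it, so _3 still has to be computed unless the block is revisited as described below.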
No matter how you look at it, you are going to need to process a block twice in order to handle any code pre-side-effect, whether it be by assigning stmt uids or what have you.
VRP could pre-process the block, and if it gets to the end of the block having seen at least one statement with a side effect and no calls which may not return, it could then process the block with all the side effects already active. I'm not sure whether that buys enough to justify the cost, but it would change the value written to C to be 1, and it would change the global values exported for _2 and _3.
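A rough sketch of that pre-processing idea, in C-style pseudocode (every identifier below is hypothetical; none of these are real ranger/VRP entry points):

/* Stand-ins for the real data structures and queries.  */
struct block;
extern int stmt_count (struct block *bb);
extern int stmt_has_side_effect (struct block *bb, int i);
extern int stmt_is_call_that_may_not_return (struct block *bb, int i);
extern void process_stmt_with_side_effects_active (struct block *bb, int i);

void
maybe_reprocess_block (struct block *bb)
{
  int saw_side_effect = 0;

  /* First walk: look for side effects; give up if any call might not
     return, since then the rest of the block may never execute.  */
  for (int i = 0; i < stmt_count (bb); i++)
    {
      if (stmt_is_call_that_may_not_return (bb, i))
        return;
      if (stmt_has_side_effect (bb, i))
        saw_side_effect = 1;
    }

  /* Second walk: re-process with every side effect already active.  */
  if (saw_side_effect)
    for (int i = 0; i < stmt_count (bb); i++)
      process_stmt_with_side_effects_active (bb, i);
}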
Another option would be to flag the ssa-names instead of, or as well as, marking them as stale. If we get to the end of the block and there were no non-returning functions or EH edges, then re-calculate and export those ssa-names using the latest values. That would export [0,0] for _2 and _3.
This would have no tangible impact during the first VRP pass, but the *next* VRP pass (or any other ranger pass) would pick up the new global ranges and do all the right things... so we basically let a subsequent pass pick up the info and do the dirty work.
Andrew