On 2/23/22 02:48, Richard Biener wrote:
On Tue, Feb 22, 2022 at 8:19 PM Andrew MacLeod via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
On 2/22/22 13:07, Jeff Law wrote:
On 2/22/2022 10:57 AM, Jakub Jelinek via Gcc-patches wrote:
On Tue, Feb 22, 2022 at 12:39:28PM -0500, Andrew MacLeod wrote:
That is EH; then there are calls that might not return because they leave in some other way (e.g. longjmp), or might loop forever, might exit, might abort, might trap, etc.
Generally speaking, calls which do not return should not now be a problem... as long as they do not transfer control to somewhere else in the current function.
I thought all of those cases are very relevant to PR104530.
If we have:
_1 = ptr_2(D) == 0;
// unrelated code in the same bb
_3 = *ptr_2(D);
then in light of PR104288, we can optimize ptr_2(D) == 0 into false only if there are no calls inside of "// unrelated code in the same bb", or if all calls in "// unrelated code in the same bb" are guaranteed to return exactly once. Because, if there is a call in there which could exit (that is the PR104288 testcase), or abort, or trap, or loop forever, or throw externally, or longjmp, or in any other non-UB way cause the _1 = ptr_2(D) == 0; stmt to be invoked at runtime but _3 = *ptr_2(D) not to be invoked, then we can't optimize the earlier comparison, because ptr_2(D) could be NULL in a valid program.
While if there are no calls (and no problematic inline asms) and no trapping insns in between, we can, and PR104530 is asking that we continue to optimize that.
Right. This is similar to some of the restrictions we deal with in the path isolation pass. Essentially we have a path which, when traversed, would result in a *0. We would like to be able to find the edge upon which the *0 is control dependent and optimize the test so that it always goes to the valid path rather than the *0 path. The problem is there may be observable side effects on the *0 path between the test and the actual *0 -- including calls to nonreturning functions, setjmp/longjmp, things that could trap, etc. This case is similar. We can't back-propagate the non-null status through any statements with observable side effects.
Jeff
We can't back-propagate, but we can alter our forward view. Any SSA name defined before the observable side effect can be recalculated using the updated values, and all uses of those names after the side effect would then appear to be "up-to-date".
This does not actually change anything before the side-effect statement, but the lazy re-evaluation ranger employs makes it appear as if we do a new computation when _1 is used afterwards, i.e.:
_1 = ptr_2(D) == 0;
// unrelated code in the same bb
_3 = *ptr_2(D);
_4 = ptr_2(D) == 0; // ptr_2 is known to be [+1, +INF] now.
And we use _4 everywhere _1 was used. This is the effect.
So we do not actually change anything in the unrelated code, just the observable effects afterwards. We already do these recalculations on outgoing edges in other blocks, just not within the definition block, because non-null wasn't visible within the def block.
Additionally, in the testcase there is a store to c before the side effects. These patches get rid of the branch, and thus the call, in the testcase as requested, but we still have to compute _3 in order to store it into the global c, since it occurs pre-side-effect.
b.0_1 = b;
_2 = b.0_1 == 0B;
_3 = (int) _2;
c = _3;
_5 = *b.0_1;
No matter how you look at it, you are going to need to process a block twice in order to handle any code pre-side-effect, whether it be by assigning stmt uids or what have you.
Yes. I thought that is what ranger already does when it discovers new
ranges from edges. Say we have
_1 = 10 / _2;
if (_2 == 1)
{
_3 = _1 + 1;
then when evaluating _1 + 1 we re-evaluate 10 / _2 using _2 == 1 and
can compute _3 to [11, 11]?
Correct, we get most of these first order effects via edges.
That obviously extends to any stmt-level ranges we discover for uses (not defs, because defs are never used upthread). And doing that is _not_ affected by any function/BB-terminating calls or EH or whatnot, as long as the updated ranges only affect stmts dominated by the current one.
What complicates all this reasoning is that it is straightforward when you work with a traditional IL-walking pass, but it gets hard (and possibly easy to get wrong) with on-demand processing and caching, because everything you cache will now be context dependent (valid only starting after stmt X and for stmts dominated by it).
Yeah, which is why this particular side-effect code only applies to definitions during a dom walk. We know we will not return to a def. The non-null list (and, next release, the generalized side effects) are only applied to on-exit ranges via non-EH edges, so they can't really get us into trouble, as we are sure those values only affect dominated blocks. Pure on-demand clients will not get any of this intra-block fine-tuning.