[Bug tree-optimization/61757] [4.10 Regression] genmodes failure with enable-checking

law at redhat dot com Thu, 15 Jan 2015 21:37:50 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61757


--- Comment #34 from Jeffrey A. Law <law at redhat dot com> ---
Whee, fun, finally getting a chance to investigate this BZ.  Namely why does
canonicalization of equivalences in DOM effect correctness of jump threading.

BB12 is what's interesting here.  It's the head of the loop.  It has two
predecessors.  BB7 which is outside the loop and BB6 which is inside the loop. 
BB7->BB12 is the loop entry and BB6->BB12 is the latch edge, naturally.

# i_9 = PHI <_3(7), i_12(6)>
# prephitmp_16 = PHI <q_8(D)(7), pretmp_11(6)>
i_17 = i_9 + 4294967295;
if (q_8(D) == prephitmp_16)
  goto <bb 4>;
else
  goto <bb 8>;

When DOM enters BB12 (via BB7) it does not record anything from the PHI since
the block has multiple predecessors and the PHI is not a degenerate (ie, the
PHI doesn't generate an equivalence that holds every time BB12 executes).  When
DOM gets to the conditional, it'll record that on the true edge that _8 and _16
are equivalent.

It's important to realize that _16 is clearly defined in the loop and that _8
comes from outside the loop (it's actually a default definition).

Eventually DOM "traverses" that true edge and we call record_equivalence (which
has the loop depth canonicalization).  It can record either _8 = _16 or _16 =
_8 (OK, I guess it could record both).  Without canonicalization we can end up
with _8 = 16, ie, an object from outside the loop is defined as equivalent to
an object inside the loop.

That is bad because when we traverse the loop back edge during threading,
objects in the loop can change their values and must trigger invalidations.

Invalidation of equivalences the threader is designed on the assumption that
nothing outside the loop is ever equivalent to something inside the loop.  In
effect we assume that all we need to do is either record a new equivalence or
invalidate equivalences on every SSA_NAME assigned in the loop as we process
the assignment.

If we have to handle cases where something outside the loop is recorded as
equivalent to something inside the loop, then we'd do to an invalidate for
every ssa name in the loop, then record any newly discovered equivalences.

I tried real hard to avoid that because of the cost of invalidation.  Granted
the cost of invalidation now is dramatically smaller, I'd still prefer not to
have to do so many invalidations.

Given that I want to look at ripping all the backedge stuff out in favor of the
on-demand backwards walk approach we're doing for FSMs, I think our best bet is
to leave things as-is for gcc-5.

The comment in the code references back to this BZ, so I don't think we need to
update the comment further.

[Bug tree-optimization/61757] [4.10 Regression] genmodes failure with enable-checking

Reply via email to