https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69452
Jeffrey A. Law <law at redhat dot com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |richard.guenther at gmail dot com Assignee|law at redhat dot com |unassigned at gcc dot gnu.org --- Comment #2 from Jeffrey A. Law <law at redhat dot com> --- This looks like a latent bug in tree-ssa-loop-im.c introduced by Richi a few years ago when he added handling of invariant PHIs. It appears that the code assumes that a dominator walk is sufficient to order invariant statements moved. However, consider a diamond CFG a / \ b c \ / d a dominates b, c & d. Valid dominator walk orders would be a, b, c, d a, b, d, c a, c, b, d a, c, d, b a, d, b, c a, d, c, b Note carefully that we could visit d before b or c. So a pass which assumes that we always visit b & c before d for correctness is fundamentally flawed. The move_computations::before_dom_children call back is visited by the dom walker and just inserts instructions on the appropriate edge. Each instruction is inserted on the edge *when it's appropriate block is visited*. So if we have an invariant in d which depends on invariants in b & c, a domwalk can not be relied upon to order those statements. The obviously question is how can that happen in SSA? There can't be a statement in d that depends on something in b & c because b & c don't dominate d. But wait... PHI nodes. A PHI can be defined in d which has arguments defined in b & c. That's precisely what's happening with this BZ. We have a PHI in block d which is fed by values in blocks c & b. However, the order of visitation in the domwalk is b, d, c resulting in something like this: _71 = (char) pretmp_85; c_lsm.17_116 = r_75 != 0 ? _93 : _71; _93 = (char) pretmp_85; Which is obviously invalid SSA (ignore that _71 and _93 have the same value). What's key to remember here is that a domwalk will not guarantee that blocks defining arguments in a PHI will be visited before the PHI itself -- even in loop-free code. If we're going to keep the current design of the invariant-motion code, then we probably need to do something like a topological sort of the statements on the edges before we commit the edge insertions. We could also consider sorting the blocks for the dominator walk. There's been a few cases through the years where that would be helpful, but not in a really significant way, just minor stuff we'd catch earlier in the optimizer pipeline if we had a better visitation order.