On Fri, Oct 26, 2012 at 8:30 AM, Alexandre Oliva <aol...@redhat.com> wrote: > Both jump threading and loop induction variable optimizations were > dropping useful debug information, and it took improvements in both for > debug info about relevant variables in the enclosed testcase to survive > all the way to the end. > > The first problem was that jump threading could bypass blocks containing > debug stmts, losing the required bindings. The solution was to > propagate bypassed debug insns into the target of the bypass. Even if > the new confluence ends up introducing new PHI nodes, the propagated > debug binds will resolve to them, just as intended. This is implemented > in the 4th patch below: vta-jump-thread-prop-debug-pr54693.patch > > The second part of the problem was that, when performing loop ivopts, > we'd often drop PHI nodes and other SSA names for unused ivs. Although > we had code to propagate SSA DEFs into debug uses upon their removal, > this couldn't take care of PHI nodes (no debug phis or conditional debug > binds), and it was precisely at the removal of a PHI node that we > dropped debug info for the loop in the provided testcase. Once Jakub > figured out how to compute an unused iv out of available ivs (thanks!), > it was easy to introduce debug temps with the expressions to compute > them, so that debug binds would remain correct as long as the unused iv > can still be computed out of the others. (If it can't, we'll still try > propagation, but may end up losing at the PHI nodes). I had thought > that replacing only the PHI nodes would suffice, but it turned out that > replacing all unused iv defs with their equivalent used-IV expressions > got us better coverage, so this is what the 5th patch below does: > vta-ivopts-replace-dropped-phis-pr54693.patch
+ if (count > 1) + { + tree vexpr = make_node (DEBUG_EXPR_DECL); + DECL_ARTIFICIAL (vexpr) = 1; + TREE_TYPE (vexpr) = TREE_TYPE (comp); + if (SSA_NAME_VAR (def)) + DECL_MODE (vexpr) = DECL_MODE (SSA_NAME_VAR (def)); + else + DECL_MODE (vexpr) = TYPE_MODE (TREE_TYPE (vexpr)); simply always use the TREE_TYPE path. TYPE_MODE is always valid for SSA_NAMEs. + FOR_EACH_IMM_USE_STMT (stmt, imm_iter, def) + { + if (!gimple_debug_bind_p (stmt)) + continue; + + FOR_EACH_IMM_USE_ON_STMT (use_p, imm_iter) + SET_USE (use_p, comp); + + if (!comp) + BREAK_FROM_IMM_USE_STMT (imm_iter); how does comp magically become NULL_TREE here? Btw, what's all the fuzz with IV candidates, etc? At least for non-PHIs I don't see why the regular release_ssa_name way of doing things does not work. IVOPTs is slow enough ... Richard. > > Regression testing revealed -fcompare-debug regressions exposed by these > patches. > > x86-specific code introduces pre-reload scheduling dependencies between > calls and likely-spilled parameter registers, but it does so in a way > that's AFAICT buggy, and fragile to the presence of debug insns at the > top of a block: we won't introduce a dependency for the first insn of > the block, even if we'd rather have such a dependency. This fails to > achieve the intended effect, and it also causes codegen differences when > the first insn in the block happens to be a debug insn, for then we will > add the intended dependency. The first patch below, > vta-stabilize-i386-call-args-sched-pr54693.patch, skips leading debug > insns, so as to remove the difference, and moves the end of the backward > scan to the insn before the first actual insn, so that we don't refrain > from considering it for dependencies. This in turn required an > additional test to make sure we wouldn't go past the nondebug head if > first_arg happened to be the head. > > The introduction of debug insns at new spots also exposed a bug in loop > unrolling: we'd unroll a loop a different number of times depending on > whether or not its latch is an empty block. The propagation or > introduction of debug insns in previously-empty latch blocks caused > loops to be unrolled a different number of times depending on the > presence of the debug insns, which triggered -fcompare-debug errors. > The fix was to count only nondebug insns towards the decision on whether > the latch block was empty. This is implemented in the second patch > below: vta-stabilize-loop-unroll-empty-latch-check-pr54693.patch > > Guality testsuite regressions given the patches above revealed that the > fast DCE global dead debug substitution introduced for PR54551 was not > correct: it was possible for us to visit, for the last time, a block > with a REG used in debug stmts after its death before we visited the > block with the debug use. As a result, we'd fail to emit a debug temp > at the not-revisited block, and the debug temp bindings introduced at > other blocks might be insufficient to get a value to the debug use > point, or even get an incorrect value there. I've fixed this problem by > using the DU chains at the time we realize a dead debug use uses a value > from a different block, adding at that moment debug temps at all defs of > the REG and replacing all debug uses with the debug temp, and arranging > that we don't do that again for the same REG. This ensures that, > regardless of the order in which we visit blocks, we'll get correct > debug temp bindings. This fix is implemented in the 3rd patch below: > vta-dce-globalize-debug-review-pr54551-pr54693.patch > > While looking into some crashes due to a bug in an earlier version of > the patch described above, I suspected the problem had to do with our DF > rescanning newly-emitted debug temps right away. I know SSA updating > messes with SSA def/use chains, and I suspected DF rescanning might > invalidate def/use chains as well. It turned out that the problem was > unrelated, but I kind of liked moving the initialization of the > to_rescan bitmap out of the loop over uses, and deferring the rescanning > of the new debug temp with it. However, since that's not a required > part of the proposed fixes, I split it out into a separate patch, the > 6th and last below: vta-valtrack-defer-debug-temp-rescan-pr54693.patch > > The patches were regstrapped together, on i686- and x86_64-linux-gnu, > and the only regression was guality/pr43051 on i686: a debug temp would > now be reset by the scheduler as it moved a def of a REG before a debug > temp that used the old value of the REG. I suppose we could improve > sched so as to try and emit a debug temp before the moved-ahead insn, > and then replace the REG with the debug temp in the debug use, instead > of resetting it, but since this is not exactly trivial, it's not clear > how much benefit it would bring and at what cost, and this patchset had > already been cooking for a while, I figured I'd stop at this point and > post the patchset with this caveat. > > Ok to install? > > > > > > -- > Alexandre Oliva, freedom fighter http://FSFLA.org/~lxoliva/ > You must be the change you wish to see in the world. -- Gandhi > Be Free! -- http://FSFLA.org/ FSF Latin America board member > Free Software Evangelist Red Hat Brazil Compiler Engineer >