On April 24, 2019 8:31:44 PM GMT+02:00, Jeff Law <l...@redhat.com> wrote: >On 4/24/19 4:44 AM, Richard Biener wrote: >> On Tue, Apr 23, 2019 at 4:29 PM Jeff Law <l...@redhat.com> wrote: >>> >>> >>> As discussed in the BZ, this patch addresses the false positive >warning >>> by cleaning up the const/copy propagations left in the IL between >DOM's >>> jump threading and erroneous path isolation. >>> >>> In the past we'd been handling this stuff with phi only cprop. To >make >>> phi only cprop work in this case we'd have to change it to scan >>> statements within some subset of blocks where it had previously only >>> scanned PHIs. >>> >>> I was concerned about the compile-time cost of the additional >scanning >>> plus the extra pass. So I compared that against using the lattice >based >>> const/copy propagation as well as against the RPO VN bits. >>> >>> It turns out that the lattice based const/copy propagation does the >>> least amount of work, closely followed by a limited RPO VN. The >>> improved "phi only" copy propagator being the worst. >> >> Interesting. >Definitely. I didn't dig further than what's mentioned in the BZ. > >> >>> Given that we can use the lattice copy propagator by just adding the >>> pass to passes.def whereas using the RPN VN actually requires a >little >>> bit of real code (to set up the entry/exits for the relevant SEME >>> regions), I went with the lattice copy propagator. >>> >>> This change adds around .4% instruction executions to my testbed of >.i >>> files. It has no significant impact on the resulting code -- I see >>> different register allocation decisions in a lot of places which >seem to >>> primarily result in reversing arguments to comparisons. >> >> Was there a need to have two copy-prop passes in the early >> DOM/errorneous-path removal where we previously only had >> a single phi-only-prop pass? Is the testcase fixed also when >> doing copy-prop only a single time? >The testcase is fixed with a single copyprop (lattice or RPO VN) after >DOM but before erroneous path isolation. I seriously considered just >dropping the copyprop pass after erroneous path isolation. I'm pretty >sure it'll regress codegen quality, but it may not be too bad. > > >> >> Also the late pass you replace should be right after VRP, not >> after warn_restrict (that was a mistake done when adding the >> warn_restrict pass). >Agreed. But ISTM we should make that an independent change. > >> >> The main reason I dislike this is that it is an unconditional cleanup >> pass run even when we didn't perform any jump threading. That's >> of course the same case as with the existing phi-only-prop passes >> but would have been my major argument for the SEME VN >> (where at some cut-off of course doing a single whole-function VN >> is cheaper than doing N SEME VNs, and I can even think of doing >> N SEME regions at once in exchange for doing a whole-function >> RPO order compute). >Yea. We're certainly doing more work in the cases where we didn't >thread anything. I've always wanted better ways to indicate what >actions a pass did and using that to bypass subsequent passes if they >weren't going to be profitable. > > > > >> >>> FWIW I also considered delaying the erroneous path isolation pass. >I >>> ultimately decided against that. My recollection is that its >location >>> came from the desire to clean up those paths early enough in the >>> pipeline so that other optimizers could do a better job. >>> >>> We could consider an early/late split here. Essentially we do the >>> optimization early, leaving enough data lying around somewhere for a >>> late pass to look for and issue the warning. Something like a >>> __builtin_warning that we leave in the IL, then just before >expanding to >>> RTL, we scan the IL once and issue the __builtin_warnings. >>> >>> In this specific case the __builtin_warning would be on a path that >we >>> eventually would determine was unreachable at compile time and the >path >>> would be removed. I suspect we could do something similar for other >>> warnings coming out of the middle end. >>> >>> Anyway, this has been bootstrapped and regression tested on x86_64, >>> ppc64, i686, sparc64, & ppc64le. It's also been bootstrapped on >alpha, >>> ppc (32 bit), armeb, m68k, riscv64, mipsisa32r4, arm, sh4, & sh4eb. >>> It's also built and regression tested all the *-elf targets in my >tester. >>> >>> OK for the trunk, or do we want to defer to gcc-10? >> >> I like the pass removal and would say OK if you manage with >> a 1:1 replacement (thus get rid of that extra copy-prop between >> DOM and pass_isolate_erroneous_paths). >That's the one we need to fix the regression. We might be able to drop >the one after erroneous path isolation which would keep us at the 1:1 >replacement. I'll poke at that.
Works for me as well if testing succeeds. Richard. >jeff