On Tue, Apr 23, 2019 at 4:29 PM Jeff Law <l...@redhat.com> wrote: > > > As discussed in the BZ, this patch addresses the false positive warning > by cleaning up the const/copy propagations left in the IL between DOM's > jump threading and erroneous path isolation. > > In the past we'd been handling this stuff with phi only cprop. To make > phi only cprop work in this case we'd have to change it to scan > statements within some subset of blocks where it had previously only > scanned PHIs. > > I was concerned about the compile-time cost of the additional scanning > plus the extra pass. So I compared that against using the lattice based > const/copy propagation as well as against the RPO VN bits. > > It turns out that the lattice based const/copy propagation does the > least amount of work, closely followed by a limited RPO VN. The > improved "phi only" copy propagator being the worst.
Interesting. > Given that we can use the lattice copy propagator by just adding the > pass to passes.def whereas using the RPN VN actually requires a little > bit of real code (to set up the entry/exits for the relevant SEME > regions), I went with the lattice copy propagator. > > This change adds around .4% instruction executions to my testbed of .i > files. It has no significant impact on the resulting code -- I see > different register allocation decisions in a lot of places which seem to > primarily result in reversing arguments to comparisons. Was there a need to have two copy-prop passes in the early DOM/errorneous-path removal where we previously only had a single phi-only-prop pass? Is the testcase fixed also when doing copy-prop only a single time? Also the late pass you replace should be right after VRP, not after warn_restrict (that was a mistake done when adding the warn_restrict pass). The main reason I dislike this is that it is an unconditional cleanup pass run even when we didn't perform any jump threading. That's of course the same case as with the existing phi-only-prop passes but would have been my major argument for the SEME VN (where at some cut-off of course doing a single whole-function VN is cheaper than doing N SEME VNs, and I can even think of doing N SEME regions at once in exchange for doing a whole-function RPO order compute). > FWIW I also considered delaying the erroneous path isolation pass. I > ultimately decided against that. My recollection is that its location > came from the desire to clean up those paths early enough in the > pipeline so that other optimizers could do a better job. > > We could consider an early/late split here. Essentially we do the > optimization early, leaving enough data lying around somewhere for a > late pass to look for and issue the warning. Something like a > __builtin_warning that we leave in the IL, then just before expanding to > RTL, we scan the IL once and issue the __builtin_warnings. > > In this specific case the __builtin_warning would be on a path that we > eventually would determine was unreachable at compile time and the path > would be removed. I suspect we could do something similar for other > warnings coming out of the middle end. > > Anyway, this has been bootstrapped and regression tested on x86_64, > ppc64, i686, sparc64, & ppc64le. It's also been bootstrapped on alpha, > ppc (32 bit), armeb, m68k, riscv64, mipsisa32r4, arm, sh4, & sh4eb. > It's also built and regression tested all the *-elf targets in my tester. > > OK for the trunk, or do we want to defer to gcc-10? I like the pass removal and would say OK if you manage with a 1:1 replacement (thus get rid of that extra copy-prop between DOM and pass_isolate_erroneous_paths). Richard. > > Jeff > > >