On Thu, Feb 04, 2016 at 10:11:53AM +0000, Bin Cheng wrote:
> Hi,
> There is a performance regression caused by my previous change to
> aarch64_legitimize_address, in which I forced the constant offset out of
> the memory ref if the address expr is in the form of
> "reg1 + reg2 << scale + const".
> The intention was to reveal as many loop-invariant opportunities as
> possible, while depending on the GIMPLE optimizers to pick up CSE
> opportunities of "reg << scale" among different memory references.
>
> Though the assumption still holds, the GIMPLE optimizers are not powerful
> enough to pick up CSE opportunities of register scaling expressions at the
> current time.  Here comes a workaround: this patch forces the register
> scaling expression out of the memory ref, so that the RTL CSE pass can
> handle common register scaling expressions, at a cost of possibly missed
> loop invariants.
>
> James and I collected perf data.  Fortunately, this change improves
> performance for several cases in various benchmarks, while not causing
> any big regression.  It also recovers the big regression we observed
> before for the previous change.
>
> I also added a comment explaining why the workaround is necessary.  I also
> filed PR69653 as an example showing that the tree optimizer should be
> improved.
>
> Bootstrapped and tested on AArch64, is it OK?

OK.

Thanks,
James

>
> Thanks,
> bin
>
>
> 2016-02-04  Bin Cheng  <bin.ch...@arm.com>
>
> 	* config/aarch64/aarch64.c (aarch64_legitimize_address): Force
> 	register scaling out of memory reference and comment why.
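
For readers unfamiliar with the address form discussed above, here is a minimal C sketch of two memory references that share a scaled index. It is an illustration only: the struct `table` and the functions `sum_two`, `sum_two_explicit` and `self_check` are hypothetical names, not taken from the patch or from PR69653, and the sketch makes no claim about the code GCC actually emits. In `sum_two`, both loads have addresses of the shape "base + (i << scale) + const"; `sum_two_explicit` writes the scaled index out by hand once, which is roughly the common subexpression the RTL CSE pass is expected to find after the workaround.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout: the leading member gives 'vals' a non-zero
   constant offset inside the struct.  */
struct table { int head[4]; int vals[64]; };

/* Both loads address memory as base + (i << scale) + const, where
   the scaled index is shared and only the base register differs.  */
int sum_two(const struct table *a, const struct table *b, long i)
{
    return a->vals[i] + b->vals[i];
}

/* The same computation with the scaled index computed once and
   reused by both address calculations.  */
int sum_two_explicit(const struct table *a, const struct table *b, long i)
{
    size_t scaled = (size_t) i * sizeof (int);   /* i << 2 if int is 4 bytes */
    size_t off = offsetof (struct table, vals);  /* the constant part */
    const int *pa = (const int *) ((const char *) a + off + scaled);
    const int *pb = (const int *) ((const char *) b + off + scaled);
    return *pa + *pb;
}

/* Checks that both forms agree for every valid index.  */
void self_check(void)
{
    static struct table a, b;
    for (int k = 0; k < 64; k++) {
        a.vals[k] = k;
        b.vals[k] = 100 + k;
    }
    for (long i = 0; i < 64; i++)
        assert(sum_two(&a, &b, i) == sum_two_explicit(&a, &b, i));
}
```

Calling `self_check()` from a test driver confirms that the two forms compute the same value. Whether a given compiler version shares the shift between the two loads can only be seen by inspecting the generated assembly.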