On Thu, Feb 04, 2016 at 10:11:53AM +0000, Bin Cheng wrote:
> Hi,
> There is a performance regression caused by my previous change to
> aarch64_legitimize_address, in which I forced the constant offset out of the
> memory ref if the address expr is of the form "reg1 + reg2 << scale + const".
> The intention was to expose as many loop-invariant opportunities as possible,
> while depending on the GIMPLE optimizers to pick up CSE opportunities for
> "reg << scale" among different memory references.
> 
> Though the assumption still holds, the GIMPLE optimizers are currently not
> powerful enough to pick up CSE opportunities for register scaling expressions.
> As a workaround, this patch forces the register scaling expression out of the
> memory ref, so that the RTL CSE pass can handle common register scaling
> expressions, at the cost of possibly missing some loop invariants.
> 
> James and I collected perf data.  Fortunately, this change improves
> performance for several cases in various benchmarks without causing any big
> regression.  It also recovers the big regression we observed after the
> previous change.
> 
> I also added a comment explaining why the workaround is necessary, and I
> filed PR69653 as an example showing that the tree optimizers should be improved.
> 
> Bootstrapped and tested on AArch64.  Is it OK?

OK.

Thanks,
James

> 
> Thanks,
> bin
> 
> 
> 2016-02-04  Bin Cheng  <bin.ch...@arm.com>
> 
>       * config/aarch64/aarch64.c (aarch64_legitimize_address): Force
>       register scaling out of memory reference and comment why.
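
The trade-off described in the quoted message can be modelled in plain C.
This is only an illustrative sketch, not the code in
aarch64_legitimize_address and not the exact sequence the patch emits: the
function names, struct pair, and the scale of 4 are all assumptions made for
the example.  It shows that both groupings of "reg1 + reg2 << scale + const"
yield the same address, and that they differ only in which subexpression can
be shared.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: address = reg1 + (reg2 << scale) + const.  */

/* Constant forced out of the memory reference.  (reg1 + off) does not
   change when only reg2 changes, so it can be hoisted out of a loop,
   but every reference carries its own copy of the shift.  */
static uintptr_t
addr_const_out (uintptr_t reg1, uintptr_t reg2, unsigned scale, uintptr_t off)
{
  uintptr_t invariant_base = reg1 + off;
  return invariant_base + (reg2 << scale);
}

/* Scaling forced out of the memory reference.  The shifted index is
   passed in already computed, so several references can share it.  */
static uintptr_t
addr_scale_out (uintptr_t reg1, uintptr_t scaled, uintptr_t off)
{
  return reg1 + scaled + off;
}

/* Two 8-byte fields, so one element is 16 bytes (scale == 4).  */
struct pair
{
  int64_t a;
  int64_t b;
};

/* Two references to the same element: the shift is computed once and
   reused, while each reference keeps its own constant offset.  */
static int64_t
sum_fields (const struct pair *p, size_t i)
{
  uintptr_t base = (uintptr_t) p;
  uintptr_t scaled = (uintptr_t) i << 4;	/* shared "reg << scale" */
  const int64_t *pa
    = (const int64_t *) addr_scale_out (base, scaled,
					offsetof (struct pair, a));
  const int64_t *pb
    = (const int64_t *) addr_scale_out (base, scaled,
					offsetof (struct pair, b));
  return *pa + *pb;
}
```

In this model, sum_fields corresponds to the case the workaround targets:
several memory references in the same block that differ only in their
constant offset.  addr_const_out corresponds to the earlier behaviour, which
is preferable when the index is a loop induction variable and the base plus
constant can be hoisted.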
