https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62173
--- Comment #10 from Jiong Wang <jiwang at gcc dot gnu.org> ---
I finished a further investigation; it looks like the simplest way to generate
optimized code for case A is to add one more optimization case in
"eliminate_regs_in_insn".
Currently we only optimize "eliminate_reg + const_offset", guarded by:

  if (plus_src
      && CONST_INT_P (XEXP (plus_src, 1)))
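For reference, the case that existing path already covers is a plain scalar
local whose address is just the eliminated register plus a constant; this
reduced C example is my own illustration, not taken from the PR:

  extern void use (int *);

  int
  g (void)
  {
    int x = 0;   /* address is (plus (reg frame) (const_int C))       */
    use (&x);    /* both parts are constants, so the elimination      */
    return x;    /* offset folds straight into C today                */
  }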
However, for architectures like AArch64, ARM, and MIPS, which support a
base + offset addressing mode, the following pattern, which typically
computes an array element address such as A[I]:
reg T <- eliminate_reg + reg I (which holds the value I)
reg D <- MEM(reg T, offset)
should be eliminated into (folding the two constant offsets immediately):
reg T <- reg_after_eliminate + reg I
reg D <- MEM(reg T, offset + eliminate_offset)
instead of
reg S <- reg_after_eliminate + eliminate_offset
reg T <- reg S + reg I
reg D <- MEM(reg T, offset)
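At the source level this corresponds to an indexed access to a stack object.
Here is a reduced example of my own (an assumption about the shape of case A,
not the actual test case attached to this PR):

  extern void init (int *);

  int
  f (long i)
  {
    int a[64];     /* stack object, addressed via the eliminated register */
    init (a);
    return a[i];   /* address = eliminated base + elim_offset + i*4; the
                      elim_offset should fold into the load's displacement
                      rather than cost an extra add                        */
  }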
Because D depends directly on T, we only need to check NEXT_INSN while doing
the elimination to detect whether such a pattern is present.
I'll try this approach, which is quite clean; hopefully AArch64, ARM, and MIPS
can all be fixed by it.
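Roughly, the NEXT_INSN check could look like the sketch below. This is only an
illustration of the idea, not actual reload.c code: the helper name, its
placement, and the exact types are my own, and the check that the combined
displacement is still a legitimate address on the target is omitted.

  /* Sketch only: return true if the insn after INSN uses DEST purely as
     the base of a reg+displacement MEM, so the elimination offset can be
     folded into that displacement instead of needing a separate add.  */
  static bool
  next_insn_is_base_plus_disp_mem (rtx_insn *insn, rtx dest)
  {
    rtx_insn *next = NEXT_INSN (insn);
    if (!next || !NONJUMP_INSN_P (next))
      return false;

    rtx set = single_set (next);
    if (!set)
      return false;

    /* Accept either a load  D <- MEM (T + disp)
       or a store            MEM (T + disp) <- D.  */
    rtx mem = MEM_P (SET_SRC (set)) ? SET_SRC (set) : SET_DEST (set);
    if (!MEM_P (mem))
      return false;

    rtx addr = XEXP (mem, 0);
    if (GET_CODE (addr) == PLUS
        && rtx_equal_p (XEXP (addr, 0), dest)
        && CONST_INT_P (XEXP (addr, 1)))
      /* Caller would add the elimination offset to this displacement,
         subject to the target's addressing constraints.  */
      return true;

    /* A plain MEM (T) works too: the new displacement is just the
       elimination offset.  */
    return REG_P (addr) && rtx_equal_p (addr, dest);
  }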