https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62173

--- Comment #10 from Jiong Wang <jiwang at gcc dot gnu.org> ---
Finished a further investigation; it looks like the simplest fix to generate
optimized code for case A is to add one more optimization case in
"eliminate_regs_in_insn".

Currently we only optimize the "eliminate_reg + const_offset" case:

      if (plus_src                                                              
          && CONST_INT_P (XEXP (plus_src, 1))) 
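
For illustration (my numbers, not from any testcase here): assuming
frame_pointer eliminates to stack_pointer + 32, an insn like

reg X <- frame_pointer + 16

is rewritten directly into the single insn

reg X <- stack_pointer + 48

rather than being split into a replacement followed by a separate addition.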

But on architectures such as AArch64, MIPS and ARM, which support a
base + offset addressing mode, there is another common pattern, typically
produced for an array element address such as A[I] (a source-level
illustration follows further below):

reg T <- eliminate_reg + reg I (which holds the value of I)
reg D <- MEM(reg T, offset)

We should eliminate this into (folding the two constant offsets immediately):

reg T <- reg_after_eliminate + reg I
reg D <- MEM(reg T, offset + eliminate_offset)

instead of the current result:

reg S <- reg_after_eliminate + eliminate_offset
reg T <- reg S + reg I
reg D <- MEM(reg T, offset)
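
As a source-level illustration (my own example, not the testcase from this
PR), a frame-allocated array indexed by a variable gives roughly this shape:
the index addition uses the eliminable register as base, and the array's
constant offset within the frame becomes the load's displacement, so that
displacement could absorb the elimination offset:

/* Illustration only: "buf" lives in the frame, so the address of buf[i]
   is (eliminate_reg + reg_i) plus buf's constant frame offset, i.e. the
   "offset" in the MEM patterns above.  */
int
read_elem (const char *src, int i)
{
  char buf[64];
  __builtin_memcpy (buf, src, sizeof buf);
  return buf[i];
}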

Because D depends directly on T, we just need to check NEXT_INSN while doing
the elimination to detect whether such a pattern is present; a sketch of that
check follows below.
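
A rough sketch of what that NEXT_INSN check could look like (my sketch, not
an actual patch; the helper name is made up, and it glosses over the address
validation and re-recognition that eliminate_regs_in_insn would really need):

/* Hypothetical helper: INSN has just been rewritten as
   "T <- reg_after_eliminate + I"; if the next insn is a load
   "D <- MEM (T + offset)", fold ELIM_OFFSET into that displacement so
   no separate "S <- reg_after_eliminate + elim_offset" is needed.  */
static bool
fold_elim_offset_into_next_mem (rtx_insn *insn, rtx t_reg,
                                HOST_WIDE_INT elim_offset)
{
  rtx_insn *next = NEXT_INSN (insn);
  if (!next || !NONJUMP_INSN_P (next))
    return false;

  rtx set = single_set (next);
  if (!set || !MEM_P (SET_SRC (set)))
    return false;

  rtx addr = XEXP (SET_SRC (set), 0);
  if (GET_CODE (addr) != PLUS
      || !rtx_equal_p (XEXP (addr, 0), t_reg)
      || !CONST_INT_P (XEXP (addr, 1)))
    return false;

  /* A real patch must check that the new displacement is still a
     legitimate address for the target (e.g. via validate_change).  */
  XEXP (addr, 1) = GEN_INT (INTVAL (XEXP (addr, 1)) + elim_offset);
  return true;
}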

I'll try this approach, which is quite clean; hopefully AArch64, ARM and MIPS
can all be fixed by it.
