Improve offset combination during LRA virtual register elimination?

Jiong Wang Wed, 22 Apr 2015 05:10:42 -0700

While investigating PR62173, I found another issue: GCC combines offsets
poorly during LRA virtual register elimination.


For example, suppose we access one element of a local array, A[i].
Normally we get:

sequence1
=========
  rA = sfp + rB
  rC = MEM[rA + off0]

rB contains the index "i", and off0 is the stack offset of this local
array.

LRA virtual register elimination then eliminates sfp to "hard_fp + off1",
so the final insn sequence becomes:

sequence2
=========
  rT = hard_fp + off1
  rA = rT + rB
  rC = MEM[rA + off0]

off1 and off0 could be combined into one offset, but LRA does not do so.
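
The combination being asked for is plain constant folding on the
address. A minimal sketch in C, modelling registers as integers (the
function names are made up for illustration and are not GCC internals):

```c
/* Address computed by sequence2 as LRA currently leaves it:
   two adds, then a load with displacement off0. */
static long addr_sequence2(long hard_fp, long rB, long off0, long off1)
{
    long rT = hard_fp + off1;   /* rT = hard_fp + off1 */
    long rA = rT + rB;          /* rA = rT + rB        */
    return rA + off0;           /* MEM[rA + off0]      */
}

/* Same address with the two constants folded into one displacement:
   one add, then a load with displacement off0 + off1. */
static long addr_folded(long hard_fp, long rB, long off0, long off1)
{
    long rA = hard_fp + rB;     /* rA = hard_fp + rB          */
    return rA + (off0 + off1);  /* MEM[rA + (off0 + off1)]    */
}
```

Both functions return the same address for any inputs that do not
overflow, and the folded form needs one add fewer. Whether off0 + off1
still fits the target's addressing-mode immediate is a separate question
that this sketch ignores.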

I have modified TARGET_LEGITIMIZE_ADDRESS to legitimize sequence1 into:

sequence3
=========
  rA = sfp + off0
  rC = MEM[rA + rB]

As LRA does combine constants for a simple "sfp + const", this is
eliminated into sequence4, with no extra instruction introduced.

sequence4
=========
  rA = hard_fp + off3 (off3 = off0 + off1)
  rC = MEM[rA + rB]

But the problem, as noted in comment #8 of PR62173, is that it is not
always good to generate sequence3: it is not friendly to CSE, and so it
increases register pressure.

For example, suppose we have three local arrays, A[i], B[i], C[i]. Then
the instruction sequence will be:

  rA0 = sfp + off0
  rC0 = MEM[rA0 + rB]
  rA1 = sfp + off1
  rC1 = MEM[rA1 + rB]
  rA2 = sfp + off2
  rC2 = MEM[rA2 + rB]

While the old one would be:

  rA0 = sfp + rB
  rC0 = MEM[rA0 + off0]
  rA1 = sfp + rB
  rC1 = MEM[rA1 + off1]
  rA2 = sfp + rB
  rC2 = MEM[rA2 + off2]

sfp + rB will be CSEd, which lowers register pressure.
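
To make the CSE difference concrete, here is a small sketch in C that
counts how many distinct address adds remain in each form once repeated
expressions are shared. The operand ids, struct, and function names are
hypothetical and exist only for this illustration; real CSE works on RTL
and is far more involved.

```c
#include <stddef.h>

/* Hypothetical operand ids for this sketch only. */
enum operand { SFP, RB, OFF0, OFF1, OFF2 };

/* One "dst = lhs + rhs" address computation. dst is omitted because
   only the right-hand side matters when looking for common
   subexpressions. */
struct add_insn { enum operand lhs, rhs; };

/* Count distinct right-hand sides, i.e. how many adds survive if every
   repeated expression is computed once and reused. */
static size_t count_distinct_adds(const struct add_insn *insns, size_t n)
{
    size_t distinct = 0;
    for (size_t i = 0; i < n; i++) {
        int seen = 0;
        for (size_t j = 0; j < i; j++) {
            if (insns[j].lhs == insns[i].lhs && insns[j].rhs == insns[i].rhs) {
                seen = 1;
                break;
            }
        }
        if (!seen)
            distinct++;
    }
    return distinct;
}

/* Three arrays in the sequence3 form: sfp + off0, sfp + off1, sfp + off2. */
static size_t adds_in_sequence3_form(void)
{
    const struct add_insn insns[] = { {SFP, OFF0}, {SFP, OFF1}, {SFP, OFF2} };
    return count_distinct_adds(insns, sizeof insns / sizeof insns[0]);
}

/* Three arrays in the old form: sfp + rB, three times. */
static size_t adds_in_old_form(void)
{
    const struct add_insn insns[] = { {SFP, RB}, {SFP, RB}, {SFP, RB} };
    return count_distinct_adds(insns, sizeof insns / sizeof insns[0]);
}
```

In this model the sequence3 form keeps three separate base registers
live, while the old form needs only one shared base.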

IMO, if such instruction sequences (which are quite normal for RISC)
occur in a loop, then we should always legitimize them into the form of
sequence3, as it facilitates constant combination during LRA virtual
register elimination and creates one more loop iv as a side effect, thus
normally saving two instructions in the loop.

If such instruction sequences do not occur in a loop, though, any
thoughts on how to teach LRA to combine the two constants in sequence2?

(I have filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64082)

-- 
Regards,
Jiong
