While investigating PR62173, I found another issue: GCC combines offsets poorly during LRA virtual register elimination.
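
To make the discussion concrete, here is a minimal C sketch of the kind of source that produces such an access. It is a hypothetical test case, not one taken from the PR: the function name, the array size, and the use of "volatile" (only there so the compiler keeps the array on the stack instead of optimizing it away) are all my assumptions.

```c
#include <stddef.h>

/* Hypothetical test case: index into a local (stack-allocated) array.
   Conceptually, the address of A[i] is
     frame pointer + <stack offset of A> + <scaled index>.  */
int
read_local (size_t i)
{
  /* volatile is an assumption made here to force A into a stack slot.  */
  volatile int A[16];

  for (size_t k = 0; k < 16; k++)
    A[k] = (int) k * 3;

  /* Mask the index so the access always stays in bounds.  */
  return A[i & 15];
}
```

Compiling something like this with -O2 and looking at the RTL dumps around reload should show whether the frame offset and the array offset end up as one constant or two; the exact output depends on the target and compiler version.
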
For example, suppose we access one element of a local array, A[i]. Normally we get:

sequence1
=========
rA = sfp + rB
rC = MEM[rA + off0]

Here rB contains the index "i" and off0 is the stack offset of the local array. LRA virtual register elimination then eliminates sfp to "hard_fp + off1", so the final insn sequence becomes:

sequence2
=========
rT = hard_fp + off1
rA = rT + rB
rC = MEM[rA + off0]

off1 and off0 could be combined into one offset, but LRA doesn't do so.

I have modified TARGET_LEGITIMIZE_ADDRESS to legitimize sequence1 into:

sequence3
=========
rA = sfp + off0
rC = MEM[rA + rB]

As LRA does combine constants for a simple "sfp + const", this is eliminated into sequence4, with no extra instruction introduced:

sequence4
=========
rA = hard_fp + off3    (off3 = off0 + off1)
rC = MEM[rA + rB]

But the problem, as noted in comment 8 of PR62173, is that it's not always good to generate sequence3: it's not friendly to CSE, and it increases register pressure.

For example, suppose we have three local arrays, A[i], B[i] and C[i]. Then the instruction sequence will be:

rA0 = sfp + off0
rC0 = MEM[rA0 + rB]
rA1 = sfp + off1
rC1 = MEM[rA1 + rB]
rA2 = sfp + off2
rC2 = MEM[rA2 + rB]

While the old one will be:

rA0 = sfp + rB
rC0 = MEM[rA0 + off0]
rA1 = sfp + rB
rC1 = MEM[rA1 + off1]
rA2 = sfp + rB
rC2 = MEM[rA2 + off2]

Here "sfp + rB" will be CSEd, which lowers register pressure.

IMO, if such instruction sequences (which are quite normal for RISC) occur in a loop, then we should always legitimize them into the format of sequence3, as that facilitates constant combination during LRA virtual register elimination and creates one more loop iv as a side effect, thus normally saving two instructions in the loop.

If such instruction sequences do not occur in a loop, does anyone have thoughts on how to teach LRA to combine the two constants in sequence2?

(I have filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64082)

--
Regards,
Jiong