https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118611
Richard Sandiford <rsandifo at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2025-01-24
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
                 CC|                            |rsandifo at gcc dot gnu.org

--- Comment #4 from Richard Sandiford <rsandifo at gcc dot gnu.org> ---
I think the problem is IRA rather than LRA.  As a result of the quoted
instruction, IRA realises that r101 and r103 should be tied.  It therefore
forms a thread for them:

      Forming thread by copy 3:a1r101-a3r103 (freq=1000):
        Result (freq=4000):  a1r101(2000) a3r103(2000)

But it happens to allocate r101 first, even though r103's allocation is more
constrained:

      Popping a7(r106,l0)  -- assign reg 63
      Popping a1(r101,l0)  -- assign reg 63
      Popping a3(r103,l0)  -- assign reg 62

So IRA picks 63 for both r101 and r106.  But r103 and r106 are live at the
same time, so it has to fall back on 62 for r103.

I don't think allocating r101 and then r103 is necessarily the wrong order.
There could be other cases where the current order gives the best result and
the opposite order wouldn't.  Instead, it seems like the cost of allocating 63
to r101 doesn't fully reflect the r103→r101 copy that we would fail to
eliminate (or, alternatively, that the cost of allocating 62 doesn't fully
reflect the saving of eliminating the copy).
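
To make the cost argument concrete, here is a toy model of the situation. It is only a sketch and is not IRA's code or its real cost formulas. The pseudo names, the copy frequency, the pop order and the r103/r106 conflict come from the dump and text above. Everything else is an assumption made for illustration: only registers 62 and 63 are considered, all base costs are equal, ties go to the higher-numbered register, r101 conflicts with neither r103 nor r106, and the helper names (`allocate`, `partners`, `conflict`) are hypothetical.

```python
# Toy model only: not IRA's real data structures or cost formulas.
REGS = [62, 63]                         # assumption: only these two are candidates
CONFLICTS = {("r103", "r106")}          # live at the same time (from the comment)
COPIES = {("r101", "r103"): 1000}       # copy 3, freq=1000 (from the dump)
ORDER = ["r106", "r101", "r103"]        # order in which the dump pops allocnos


def conflict(a, b):
    return (a, b) in CONFLICTS or (b, a) in CONFLICTS


def partners(a):
    """Yield (other pseudo, copy frequency) for each copy involving a."""
    for (x, y), freq in COPIES.items():
        if a == x:
            yield y, freq
        elif a == y:
            yield x, freq


def allocate(copy_aware):
    assigned = {}
    for a in ORDER:
        costs = {}
        for reg in REGS:
            # Registers held by a conflicting pseudo are unavailable.
            if any(conflict(a, b) and r == reg for b, r in assigned.items()):
                continue
            cost = 0  # assumption: equal base cost for every register
            for p, freq in partners(a):
                if p in assigned:
                    # Partner already placed: pay for the copy unless we match it.
                    if assigned[p] != reg:
                        cost += freq
                elif copy_aware:
                    # Partner not placed yet: charge the copy if the partner
                    # can no longer take `reg` because a conflicting pseudo has it.
                    if any(conflict(p, b) and r == reg
                           for b, r in assigned.items()):
                        cost += freq
            costs[reg] = cost
        # Assumption: ties go to the highest-numbered register.
        assigned[a] = max(costs, key=lambda r: (-costs[r], r))
    # Total frequency of copies whose two sides ended up in different registers.
    remaining = sum(f for (x, y), f in COPIES.items()
                    if assigned[x] != assigned[y])
    return assigned, remaining


print(allocate(copy_aware=False))
print(allocate(copy_aware=True))
```

With `copy_aware=False` the model reproduces the assignment in the dump (r106 and r101 in 63, r103 in 62) and the copy survives. With `copy_aware=True` the pop order is unchanged, but 63 now looks more expensive for r101 because r103 can no longer take it, so r101 and r103 both land in 62 and the copy goes away. This only illustrates the first alternative suggested above, charging the missed copy to register 63. It says nothing about how such a cost would be computed inside IRA.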