https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91981
--- Comment #6 from Segher Boessenkool <segher at gcc dot gnu.org> ---
Attempting shrink-wrapping optimization.
Block 2 needs the prologue.

(That's the entry block, already).  And in fact it does need the prologue,
it has

        movq    %rdi, %rbx      # 2     [c=4 l=3]  *movdi_internal/3

This was already decided by IRA:

(insn 2 87 3 2 (set (reg/v/f:DI 105 [ v ])
        (reg:DI 115)) "91981.c":46:30 66 {*movdi_internal}
     (expr_list:REG_DEAD (reg:DI 115)
        (nil)))

and IRA picked

 16:r82  l0  1    17:r83  l0  0     8:r88  l0  1     6:r89  l0  6
 12:r92  l0 40     4:r93  l0 41     5:r94  l0 40     7:r95  l0  5
 27:r97  l0  2     2:r100 l0  6    10:r103 l0  0     9:r104 l0  0
  0:r105 l0  3    18:r106 l0  0    15:r107 l0  1    14:r109 l0  1
 13:r111 l0  0     3:r113 l0 40     1:r114 l0  6    19:r115 l0  5
 11:r116 l0  0

(105 gets bx, 115 gets di).

Ideally IRA would choose registers better and not use non-volatile
registers early in the function.  But shrink-wrapping could try to correct
for that; that has been on my to-do list for a long time now, but it is hard
to come up with good heuristics.

There are three mechanisms that can be used:

1) Rename registers.  Sometimes you can shuffle the registers a bit such
that the one you care about gets a volatile register.
2) More feasible, you can create register copies to move the stuff around.
Sometimes late passes can even get rid of those copies.
3) You can copy the code using those non-volatile registers to all
successor blocks.  Or just the code that sets the register.

And you have to be careful that the inputs to the code you copy are still
live at the new position(s), etc.

Often you cannot get rid of *all* non-volatile registers, even in the entry
block.  Deciding which to get rid of, where, and how, is quite a big problem.
But maybe there is some simple heuristic that works well that I just fail to
see :-)
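For readers who want a concrete picture, here is a small sketch of the kind of source that tends to produce this pattern. It is hypothetical and is not the 91981.c test case from this report; the names vec, grow, push, push_slow and push_split are made up for illustration. In push, the pointer v stays live across the call to grow, so a register allocator may well put it in a non-volatile register and copy it there in the entry block, which then needs the prologue. push_split is a hand-written, source-level analogue of mechanism 3: the store is duplicated into both paths so that the fast path never needs a value to survive a call. Whether a given GCC version actually emits a prologue-free fast path for either function is not guaranteed.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical growable array, for illustration only. */
struct vec { size_t len, cap; int *data; };

/* Slow path.  noinline (a GCC attribute) keeps this a real call,
   so values that must survive it cannot sit in volatile registers. */
__attribute__((noinline)) static int grow(struct vec *v)
{
    size_t cap = v->cap ? v->cap * 2 : 4;
    int *p = realloc(v->data, cap * sizeof *p);
    if (!p)
        return -1;
    v->data = p;
    v->cap = cap;
    return 0;
}

/* v and x are live across the call to grow(), so they are candidates
   for non-volatile registers, set up in the entry block. */
int push(struct vec *v, int x)
{
    if (v->len == v->cap && grow(v) != 0)
        return -1;
    v->data[v->len++] = x;
    return 0;
}

/* Out-of-line slow path: only this function keeps v and x across a call. */
__attribute__((noinline)) static int push_slow(struct vec *v, int x)
{
    if (grow(v) != 0)
        return -1;
    v->data[v->len++] = x;
    return 0;
}

/* Same behaviour as push(), with the store duplicated into each path.
   The fast path contains no call, so nothing needs to be preserved. */
int push_split(struct vec *v, int x)
{
    if (v->len != v->cap) {
        v->data[v->len++] = x;
        return 0;
    }
    return push_slow(v, x);
}
```

Comparing the output of gcc -O2 -S for push and push_split is one way to see whether the entry block still contains a copy into a non-volatile register.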