http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50107
--- Comment #14 from Vladimir Makarov <vmakarov at redhat dot com> 2011-08-19 16:12:48 UTC ---
(In reply to comment #11)
> (In reply to comment #10)
> > >     movq    %rdi, %rdx
> > >     mulx    %rsi, %rax, %rsi
> > >     movq    %rsi, %rdx
> > >     ret
> > >     .cfi_endproc
> > > .LFE0:
> > >     .size   test_mul_64, .-test_mul_64
> > >     .ident  "GCC: (GNU) 4.7.0 20110818 (experimental)"
> > >     .section    .note.GNU-stack,"",@progbits
> > > [hjl@gnu-6 pr50107]$
> > >
> > > I would expect
> > >
> > >     movq    %rdi, %rdx
> > >     mulx    %rsi, %rax, %rdx
> > >     ret
> >
> > I think it is a reload problem.  IRA assigns dx to pseudo 71 (an insn output)
> > but reload then spills it.
>
> uti-2.i.188r.asmcons has
>
> (insn 11 4 24 2 (parallel [
>             (set (reg:DI 72)
>                 (mult:DI (reg/v:DI 64 [ b ])
>                     (reg/v:DI 63 [ a ])))
>             (set (reg:DI 73 [+8 ])
>                 (truncate:DI (ashiftrt:TI (mult:TI (zero_extend:TI (reg/v:DI 64 [ b ]))
>                             (zero_extend:TI (reg/v:DI 63 [ a ])))
>                         (const_int 64 [0x40]))))
>         ]) uti-2.i:3 339 {bmi2_mulxditi3_internal}
>      (expr_list:REG_DEAD (reg/v:DI 64 [ b ])
>         (expr_list:REG_DEAD (reg/v:DI 63 [ a ])
>             (nil))))
>
> uti-2.i.191r.ira generates:
>
> (insn 11 28 25 2 (parallel [
>             (set (reg:DI 0 ax [72])
>                 (mult:DI (reg/v:DI 4 si [orig:64 b ] [64])
>                     (reg:DI 1 dx)))
>             (set (reg:DI 4 si [orig:73+8 ] [73])
>                 (truncate:DI (ashiftrt:TI (mult:TI (zero_extend:TI (reg/v:DI 4 si [orig:64 b ] [64]))
>                             (zero_extend:TI (reg:DI 1 dx)))
>                         (const_int 64 [0x40]))))
>         ]) uti-2.i:3 339 {bmi2_mulxditi3_internal}
>      (nil))
>
> Why does IRA/reload choose SI for pseudo 73?

IRA assigns dx to pseudo 73.  Then the reload pass needs dx for pseudo 63, so
it spills pseudo 73 and reassigns it to si.  The reload pass spills pseudo 73
because it believes that any pseudo that lives through the insn, dies in it,
or is set in it (pseudo 73 is set) conflicts with the necessary reload.

Of course it is not really necessary to spill pseudo 73, but teaching the
reload pass that is a big, error-prone project.  I would not recommend
starting it.
I myself am not interested in working on the reload pass.  Instead I prefer to
work on LRA (local RA), which is a replacement for the reload pass.
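
To make the conflict rule described above concrete, here is a toy model in C. It is only a sketch and is not taken from reload's sources: the struct, the enum values, and both function names are hypothetical. The "refined" rule additionally assumes that the insn output is not earlyclobber, so an input reload into a hard register can share that register with a pseudo that the insn only sets.

```c
#include <stdbool.h>

/* Toy model only: these names are hypothetical, not GCC internals. */
enum { HARD_REG_AX = 0, HARD_REG_DX = 1, HARD_REG_SI = 4 };

struct pseudo_use {
    int pseudo;        /* pseudo register number, e.g. 73 */
    int hard_reg;      /* hard register chosen by the allocator */
    bool live_through; /* value is live both before and after the insn */
    bool dies;         /* value is last used by the insn */
    bool set;          /* value is written by the insn */
};

/* Conservative rule as described in the comment: a pseudo that lives
   through, dies in, or is set by the insn conflicts with a reload that
   needs the same hard register, so it gets spilled. */
static bool conflicts_conservative(const struct pseudo_use *p, int reload_reg)
{
    return p->hard_reg == reload_reg
           && (p->live_through || p->dies || p->set);
}

/* Hypothetical refinement (assumption: the output is not earlyclobber):
   a pseudo that is only set by the insn does not conflict with an input
   reload, because the input is consumed before the output is written. */
static bool conflicts_refined(const struct pseudo_use *p, int reload_reg)
{
    return p->hard_reg == reload_reg && (p->live_through || p->dies);
}
```

With pseudo 73 modelled as assigned to dx and only set by the insn, the conservative rule reports a conflict with an input reload of dx (hence the spill), while the refined rule does not. Getting the real pass to make that distinction safely is the large project the comment advises against.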