http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49095
Richard Guenther <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |x86_64-*-*, i?86-*-*
             Status|UNCONFIRMED                 |NEW
           Keywords|                            |missed-optimization
   Last reconfirmed|                            |2011.05.21 09:52:49
          Component|other                       |rtl-optimization
     Ever Confirmed|0                           |1

--- Comment #1 from Richard Guenther <rguenth at gcc dot gnu.org> 2011-05-21 09:52:49 UTC ---
Confirmed (not using decq is because it is slower for some archs).

On the tree level we cannot do better than

  D.2722_2 = *argv_1(D);
  D.2723_3 = D.2722_2 + -1;
  *argv_1(D) = D.2723_3;
  if (D.2723_3 == 0B)

because we lack anything like direct operations on memory (and that's good).

On the RTL side combine tries to do

Trying 7, 8 -> 9:
Failed to match this instruction:
(parallel [
        (set (mem/f:DI (reg/v/f:DI 63 [ argv ]) [2 *argv_1(D)+0 S8 A64])
            (plus:DI (mem/f:DI (reg/v/f:DI 63 [ argv ]) [2 *argv_1(D)+0 S8 A64])
                (const_int -1 [0xffffffffffffffff])))
        (set (reg/f:DI 60 [ D.2723 ])
            (plus:DI (mem/f:DI (reg/v/f:DI 63 [ argv ]) [2 *argv_1(D)+0 S8 A64])
                (const_int -1 [0xffffffffffffffff])))
    ])

because we have a use of the decrement result in the comparison.  It doesn't
try to combine this with the comparison though.

So this case is really special ;)  Without the use of the decremented value
we get the desired subq $1, (%rsi).

Manually sinking the store to *argv into the if and the else yields

        movq    (%rsi), %rbx
        subq    $1, %rbx
        je      L4
L2:
        movq    %rbx, (%rsi)
...
L4:
LCFI4:
        movq    $0, (%rsi)
        movq    %rsi, %rdi
        movq    %rsi, 8(%rsp)
        call    _fncall

so at least we then can combine the decrement with the test ...

As usual combine doesn't like stores.
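
For readers without the testcase at hand, the manual sinking described above can be sketched in C roughly as follows. This is only an illustration, not the testcase from the report: the names dec_and_test, dec_and_test_sunk and fncall_count, the local definition of fncall, and the use of long in place of the pointer-typed *argv are all assumptions made so that the snippet is self-contained and well defined.

```c
/* Hypothetical stand-in for the external fncall seen in the assembly;
   it only counts how often it was reached. */
int fncall_count = 0;

void fncall(void *p)
{
    (void)p;
    fncall_count++;
}

/* Shape of the problematic code: the store to *p happens before the
   test, and the decremented value is used again by the comparison. */
void dec_and_test(long *p)
{
    if (--*p == 0)
        fncall(p);
}

/* The store is sunk by hand into both arms of the branch, so the
   decrement and the test operate on a register value only. */
void dec_and_test_sunk(long *p)
{
    long v = *p - 1;

    if (v == 0) {
        *p = 0;          /* v is known to be zero on this path */
        fncall(p);
    } else {
        *p = v;
    }
}
```

Both functions should leave the same value in memory and reach fncall under the same condition; whether a given GCC version emits different code for them can be checked by comparing the output of gcc -O2 -S for the two.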