On x86_64, the following function (extracted from GMP; the asm statements are from its longlong.h)

void
mul_basecase (unsigned long * wp, unsigned long * up, long un,
              unsigned long * vp, long vn)
{
  long j;
  unsigned long prod_low, prod_high;
  unsigned long cy_dig;
  unsigned long v_limb;

  v_limb = vp[0];
  cy_dig = 0;
  for (j = un; j > 0; j--)
    {
      unsigned long u_limb, w_limb;
      u_limb = *up++;
      __asm__ ("mulq %3"
               : "=a" (prod_low), "=d" (prod_high)
               : "%0" (u_limb), "rm" (v_limb));
      __asm__ ("addq %5,%q1\n"
               "\tadcq %3,%q0"
               : "=r" (cy_dig), "=&r" (w_limb)
               : "0" (prod_high), "rme" (0), "%1" (prod_low), "rme" (cy_dig));
      *wp++ = w_limb;
    }
}

produces with optimization:

#APP
# 16 "t.c" 1
        mulq %rcx
# 0 "" 2
# 18 "t.c" 1
        addq %rdx,%rax
        adcq $0,%rdx
# 0 "" 2
#NO_APP

where you can see that cy_dig is allocated to the upper half of the mulq
result. It looks like we are confused by rest_of_match_asm_constraints,
which inserts

(insn 44 25 26 4 t.c:18 (set (reg/v:DI 67 [ cy_dig ])
        (reg/v:DI 68 [ prod_high ])) -1 (nil))

after the mulq asm.


--
           Summary: wrong code for multiple output asm, wrong df?
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Keywords: wrong-code
          Severity: major
          Priority: P3
         Component: rtl-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: rguenth at gcc dot gnu dot org
GCC target triplet: x86_64-*-*


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33552
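
For anyone who wants to check a given compiler against this report at run time, here is a minimal sketch of an asm-free reference for the loop above. It is not part of the original test case. The names mul_1_ref and check_mul_1_ref are hypothetical, the limbs are assumed to be 64 bits wide (as unsigned long is on x86_64 Linux), and unsigned __int128 is a GCC/Clang extension rather than standard C.

```c
#include <stdint.h>

/* Hypothetical reference for the loop in the report: multiply the
   un-limb number at up by the single limb v_limb and store the low
   un limbs of the product at wp.  The final carry is dropped, as in
   the reported function.  Assumes 64-bit limbs and a compiler that
   supports unsigned __int128.  */
void
mul_1_ref (uint64_t *wp, const uint64_t *up, long un, uint64_t v_limb)
{
  uint64_t cy_dig = 0;
  long j;

  for (j = 0; j < un; j++)
    {
      /* Full 64x64 -> 128 bit product, which is what mulq computes.  */
      unsigned __int128 prod = (unsigned __int128) up[j] * v_limb;
      uint64_t prod_low = (uint64_t) prod;
      uint64_t prod_high = (uint64_t) (prod >> 64);

      /* addq: add the incoming carry limb to the low half.  */
      uint64_t w_limb = prod_low + cy_dig;

      /* adcq: propagate the carry into the high half.  The high half of
         a 64x64 product is at most 2^64 - 2, so this cannot wrap.  */
      cy_dig = prod_high + (w_limb < prod_low);

      wp[j] = w_limb;
    }
}

/* Returns 0 if the reference gives the hand-computed result for
   (2^128 - 1) * 3, whose low 128 bits are 2^128 - 3.  */
int
check_mul_1_ref (void)
{
  const uint64_t up[2] = { UINT64_MAX, UINT64_MAX };
  uint64_t wp[2] = { 0, 0 };

  mul_1_ref (wp, up, 2, 3);
  return !(wp[0] == UINT64_C (0xFFFFFFFFFFFFFFFD)
           && wp[1] == UINT64_C (0xFFFFFFFFFFFFFFFF));
}
```

Running mul_basecase from the report on the same input and comparing its wp against mul_1_ref should show whether the build is affected. With the instruction sequence quoted above, the first addq adds the high half of the product (%rdx) instead of the previous carry, so the first limb would already be expected to differ.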