Following testcase:

--cut here--
typedef long long __m128i __attribute__ ((__vector_size__ (16), __may_alias__));
typedef long long __v2di __attribute__ ((__vector_size__ (16)));

__m128i
_mm_set_epi64x (long long __q1, long long __q0)
{
  return __extension__ (__m128i)(__v2di){ __q0, __q1 };
}
--cut here--

compiles using -O2 -msse2 into:

_mm_set_epi64x:
        movq    12(%esp), %xmm1
        movhps  4(%esp), %xmm1
(1)     movdqa  %xmm1, %xmm0
        ret

insn (1) is not needed.

Before reload, we have:

(insn:HI 12 7 18 2 epi.c:8 (set (reg/i:V2DI 21 xmm0 [ <result> ])
        (vec_concat:V2DI (mem/c/i:DI (plus:SI (reg/f:SI 16 argp)
                    (const_int 8 [0x8])) [2 __q0+0 S8 A32])
            (mem/c/i:DI (reg/f:SI 16 argp) [2 __q1+0 S8 A32]))) 1039 {vec_concatv2di} (nil))

(insn:HI 18 12 0 2 epi.c:8 (use (reg/i:V2DI 21 xmm0 [ <result> ])) -1 (nil))

For some reason reload isn't satisfied with %xmm0:

Reload 0: reload_in (DI) = (mem/c/i:DI (plus:SI (reg/f:SI 7 sp)
                                                (const_int 12 [0xc])) [2 __q0+0 S8 A32])
        reload_out (V2DI) = (reg/i:V2DI 21 xmm0 [ <result> ])
        SSE_REGS, RELOAD_OTHER (opnum = 0)
        reload_in_reg: (mem/c/i:DI (plus:SI (reg/f:SI 7 sp)
                                            (const_int 12 [0xc])) [2 __q0+0 S8 A32])
        reload_out_reg: (reg/i:V2DI 21 xmm0 [ <result> ])
        reload_reg_rtx: (reg:V2DI 22 xmm1)

The insn pattern is:

  [(set (match_operand:V2DI 0 "register_operand" "=Y2,?Y2,Y2,x,x,x")
        (vec_concat:V2DI
          (match_operand:DI 1 "nonimmediate_operand" " m,*y ,0 ,0,0,m")
          (match_operand:DI 2 "vector_move_operand" " C, C,Y2,x,m,0")))]

So, reload should take into account that operand 1 should match operand 0,
so %xmm0 should be used.

-- 
Summary: Non-optimal reload register used
Product: gcc
Version: 4.3.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: ubizjak at gmail dot com
GCC build triplet: i686-pc-linux-gnu
GCC host triplet: i686-pc-linux-gnu
GCC target triplet: i686-pc-linux-gnu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34283
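
For anyone re-checking the testcase, the following is a minimal sketch of a functional check to go with the assembly inspection. It only confirms the element order that the vec_concat above implies (__q0 as element 0, __q1 as element 1); it does not detect the extra move. The names v2di_sketch, set_epi64x_sketch and read_lane are hypothetical and not part of the report, and the sketch assumes a compiler that supports GCC's vector_size extension.

```c
#include <assert.h>
#include <string.h>

/* Same 16-byte vector layout as __v2di in the testcase (GCC vector extension). */
typedef long long v2di_sketch __attribute__ ((__vector_size__ (16)));

/* Hypothetical stand-in for _mm_set_epi64x: q0 becomes element 0, q1 element 1. */
static v2di_sketch
set_epi64x_sketch (long long q1, long long q0)
{
  return (v2di_sketch){ q0, q1 };
}

/* Hypothetical helper: copy the vector out to memory and read one element. */
static long long
read_lane (v2di_sketch v, int i)
{
  long long out[2];
  assert (i == 0 || i == 1);
  memcpy (out, &v, sizeof out);
  return out[i];
}
```

Calling read_lane (set_epi64x_sketch (1, 2), 0) is expected to give 2, and element 1 to give 1. Whether the redundant movdqa is still emitted has to be checked separately in the -O2 -msse2 -S output for an i686 target.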