https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105923
--- Comment #7 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to Hongtao.liu from comment #6)
> > .L3:
> >         subl    %r13d, %r12d
> >         cmpl    $1, %r12d
> >         je      .L6
> >         salq    $4, %r13
> >         vmovapd a(%r13), %xmm0
> >         call    _ZGVbN2v_foo
> >         vmovapd %xmm0, b(%r13)
>
> hmm, xmm version should be abandoned since it's just 1 complex double.

If there is an XMM version of a vector complex, will it be faster
than passing complex double as a struct in 2 registers?
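
For reference, a minimal sketch of the two layouts being compared. Under the
x86-64 psABI a scalar complex double argument is classified like a struct of
two doubles, so its real and imaginary parts travel in two separate SSE
registers, whereas the XMM variant called above receives both parts packed in
a single 128-bit register. The helper name complex_to_lanes is hypothetical and
only illustrates that one complex double fills exactly the two 64-bit lanes of
an XMM register; it says nothing about which convention is faster.

```c
#include <assert.h>
#include <complex.h>
#include <string.h>

/* One complex double is a pair of doubles: 16 bytes, the width of one XMM register. */
_Static_assert(sizeof(double complex) == 2 * sizeof(double),
               "complex double must be exactly two doubles");

/* Hypothetical helper (not part of GCC or glibc): copy a complex double into
 * a two-element array that stands in for the two 64-bit lanes of an XMM
 * register. C99 guarantees the real part is stored first, then the
 * imaginary part. */
static void complex_to_lanes(double complex z, double lanes[2])
{
    memcpy(lanes, &z, sizeof z);
}
```

Calling complex_to_lanes(z, lanes) leaves creal(z) in lanes[0] and cimag(z) in
lanes[1], which is the packed form a 2-lane vector variant would see.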