https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65832
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---

typedef int v4si __attribute__((vector_size(16)));

v4si bar (int *i, int *j, int *k, int *l)
{
  return (v4si) { *i, *j, *k, *l };
}

looks reasonable (no spills at least, stray move for the return value).

        movd    (%rsi), %xmm0
        movd    (%rdi), %xmm3
        movd    (%rcx), %xmm1
        movd    (%rdx), %xmm2
        punpckldq       %xmm0, %xmm3
        punpckldq       %xmm1, %xmm2
        movdqa  %xmm3, %xmm0
        punpcklqdq      %xmm2, %xmm0

With -mavx2 we get

        vmovd   (%rdx), %xmm2
        vmovd   (%rdi), %xmm3
        vpinsrd $1, (%rcx), %xmm2, %xmm1
        vpinsrd $1, (%rsi), %xmm3, %xmm0
        vpunpcklqdq     %xmm1, %xmm0, %xmm0
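
To reproduce the codegen quoted above, the test case can be put in a small translation unit like the sketch below. The function bar is taken from the comment. The helper sum_lanes is hypothetical and is only there so the lanes can be checked from scalar code. The optimization level is not stated in the comment, so something like `gcc -O2 -S` (optionally with `-mavx2`) is an assumption, and the exact output will likely vary between GCC versions.

```c
/* GCC/Clang vector extension: four 32-bit ints packed in one 16-byte vector. */
typedef int v4si __attribute__((vector_size(16)));

/* Test case from the comment: build a vector from four scalar loads. */
v4si bar(int *i, int *j, int *k, int *l)
{
    return (v4si) { *i, *j, *k, *l };
}

/* Hypothetical helper (not part of the bug report): add up the four lanes
   using the vector subscript syntax of the vector extension. */
int sum_lanes(v4si v)
{
    return v[0] + v[1] + v[2] + v[3];
}
```

Lane 0 comes from *i, lane 1 from *j, and so on, which matches the pairing in the assembly: (%rdi) with (%rsi) form the low half and (%rdx) with (%rcx) form the high half before the final quadword unpack.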