http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52572
--- Comment #3 from Marc Glisse <marc.glisse at normalesup dot org> 2012-03-13 17:57:58 UTC ---
Or for this variant:

__m256d f(__m256d *y){
  __m256d x=*y;
  x[0]=0; // or x[3]
  return x;
}

it looks like vmaskmovpd could replace:

    vmovapd     (%rdi), %ymm0
    vmovapd     %xmm0, %xmm1
    vmovlpd     .LC0(%rip), %xmm1, %xmm1
    vinsertf128 $0x0, %xmm1, %ymm0, %ymm0

(I tried a version with __builtin_shuffle but it wouldn't generate vmaskmovpd
either)

(sorry for the naive suggestions, there are too many possibilities to optimize
them all...)
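
For readers unfamiliar with the instruction, here is a rough scalar sketch of why a masked load fits this case. It is not taken from the bug report and does not use AVX: it only models the load form of vmaskmovpd (exposed as the _mm256_maskload_pd intrinsic), where a lane is read from memory if the sign bit of its mask element is set and is zeroed otherwise. The names maskload4_model and load_with_zeroed_lane are hypothetical, chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model (an assumption for illustration, not real AVX code) of a
   4 x double masked load: lane i is loaded when the sign bit of mask[i]
   is set, and is zeroed otherwise. */
void maskload4_model(double dst[4], const double src[4], const int64_t mask[4])
{
    assert(dst != NULL && src != NULL && mask != NULL);
    for (size_t i = 0; i < 4; ++i)
        dst[i] = (mask[i] < 0) ? src[i] : 0.0;  /* negative <=> sign bit set */
}

/* Hypothetical helper: the effect of "x = *y; x[lane] = 0;" expressed as a
   single masked load whose mask clears only the chosen lane. */
void load_with_zeroed_lane(double dst[4], const double src[4], size_t lane)
{
    int64_t mask[4];
    assert(lane < 4);
    for (size_t i = 0; i < 4; ++i)
        mask[i] = (i == lane) ? 0 : -1;         /* -1 has every bit set */
    maskload4_model(dst, src, mask);
}
```

In this model the x[0]=0 and x[3] variants differ only in the constant mask, which is presumably why one instruction plus a mask constant looks attractive compared with the four-instruction sequence.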