http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52572

--- Comment #3 from Marc Glisse <marc.glisse at normalesup dot org> 2012-03-13 17:57:58 UTC ---
Or for this variant:
__m256d f(__m256d *y){
  __m256d x=*y;
  x[0]=0; // or x[3]
  return x;
}
it looks like a vmaskmovpd load (which zeroes the masked-out elements) could
replace:
    vmovapd    (%rdi), %ymm0
    vmovapd    %xmm0, %xmm1
    vmovlpd    .LC0(%rip), %xmm1, %xmm1
    vinsertf128    $0x0, %xmm1, %ymm0, %ymm0
(I tried a version with __builtin_shuffle but it wouldn't generate vmaskmovpd
either)
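
For comparison, the masked load can be requested explicitly through the AVX
intrinsic _mm256_maskload_pd, which corresponds to the load form of
vmaskmovpd. This is only a sketch of the intended result, not what GCC
generates for the code above. The helper name load_with_lane0_zeroed and its
pointer-based signature are made up for illustration, and it assumes GCC or
Clang on an AVX-capable x86-64 CPU.

```c
#include <immintrin.h>

/* Hypothetical helper, not from the report: loads four doubles from src with
   lane 0 forced to zero, then stores the result to dst. */
__attribute__((target("avx")))
void load_with_lane0_zeroed(const double *src, double *dst)
{
    /* _mm256_set_epi64x takes its arguments from the highest lane to the
       lowest, so lane 0 gets a clear sign bit (masked out) and lanes 1..3
       get a set sign bit (loaded). */
    const __m256i mask = _mm256_set_epi64x(-1, -1, -1, 0);

    /* Lanes whose mask sign bit is clear are written as zero. */
    __m256d x = _mm256_maskload_pd(src, mask);

    /* Unaligned store, so dst needs no particular alignment. */
    _mm256_storeu_pd(dst, x);
}
```

Zeroing x[3] instead would only change the mask to
_mm256_set_epi64x(0, -1, -1, -1). The mask still has to be materialized in a
register, so the saving is in the load/insert sequence rather than in the
constant load.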

(sorry for the naive suggestions, there are too many possibilities to optimize
them all...)
