https://gcc.gnu.org/bugzilla/show_bug.cgi?id=31667

--- Comment #6 from Allan Jensen <linux at carewolf dot com> ---
(In reply to Andrew Pinski from comment #5)
> We produce this now:
> 
>         movdqa  x(%rip), %xmm1
>         pxor    %xmm0, %xmm0
>         movdqa  %xmm1, %xmm2
>         punpckhbw       %xmm0, %xmm1
>         movaps  %xmm1, y+16(%rip)
>         movdqa  x+16(%rip), %xmm1
>         punpcklbw       %xmm0, %xmm2
>         movaps  %xmm2, y(%rip)
>         movdqa  %xmm1, %xmm2
>         punpckhbw       %xmm0, %xmm1
>         movaps  %xmm1, y+48(%rip)
>         movdqa  x+32(%rip), %xmm1
>         punpcklbw       %xmm0, %xmm2
>         movaps  %xmm2, y+32(%rip)
>         movdqa  %xmm1, %xmm2
>         punpckhbw       %xmm0, %xmm1
>         movaps  %xmm1, y+80(%rip)
>         movdqa  x+48(%rip), %xmm1
>         punpcklbw       %xmm0, %xmm2
>         movaps  %xmm2, y+64(%rip)
>         movdqa  %xmm1, %xmm2
>         punpckhbw       %xmm0, %xmm1
>         punpcklbw       %xmm0, %xmm2
>         movaps  %xmm1, y+112(%rip)
>         movaps  %xmm2, y+96(%rip)
> 
> And even ICC produces something similar, just scheduled differently.

I hope that is because you forgot -msse4.1?
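
For reference, here is a minimal sketch of the kind of loop that would give code like the above. It is a reconstruction, not the actual test case from the bug: the array size (64 elements) and the element types (unsigned char widened to unsigned short) are assumptions inferred from the x+48 / y+112 offsets and the punpck[lh]bw-with-zero pattern. With SSE4.1 enabled, one would hope the zero-extension is done with pmovzxbw rather than pxor plus punpcklbw/punpckhbw.

```c
#include <stddef.h>

#define N 64  /* assumed: 64 input bytes, matching the x .. x+48 loads above */

/* Hypothetical globals, named after the symbols in the quoted assembly. */
_Alignas(16) unsigned char  x[N];
_Alignas(16) unsigned short y[N];

/* Zero-extend each byte of x into a 16-bit element of y. */
void widen(void)
{
    for (size_t i = 0; i < N; i++)
        y[i] = x[i];
}
```

Building this with something like "gcc -O3 -S" and again with "gcc -O3 -msse4.1 -S" should show whether the option changes the generated sequence.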
