https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103905

--- Comment #3 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Martin Liška from comment #1)
> Created attachment 52120 [details]
> Isolated test-case
> 
> Isolated test-case where only the miscompiled function
> ix86_expand_vec_extract_even_odd uses -O3.
> 
> @Uros: Can you please compare -fdump-tree-optimized before and after the
> revision?

When compiled with -O2 -march=bdver1, there are indeed a bunch of suspicious
XOP vpperm instructions in the function:

        vmovd   12(%rsp), %xmm6 # 303   [c=9 l=6]  *movsi_internal/10
        vpperm  %xmm3, %xmm0, %xmm1, %xmm0      # 124   [c=4 l=5]  mmx_ppermv64
        vpaddd  %xmm4, %xmm1, %xmm1     # 129   [c=8 l=4]  *mmx_addv2si3/2
        vpperm  %xmm3, %xmm1, %xmm2, %xmm1      # 131   [c=4 l=5]  mmx_ppermv64
        vpperm  .LC165(%rip), %xmm1, %xmm0, %xmm0       # 134   [c=13 l=9]  mmx_ppermv64
        vpaddb  %xmm0, %xmm0, %xmm0     # 137   [c=8 l=4]  *mmx_addv8qi3/2
        vpshuflw        $0, %xmm6, %xmm1        # 140   [c=8 l=5]  *vec_dupv4hi/1
        vpaddb  %xmm1, %xmm0, %xmm0     # 142   [c=8 l=4]  *mmx_addv8qi3/2
        vmovq   %xmm0, 32(%rsp,%rdi)    # 143   [c=4 l=6]  *movv8qi_internal/14
        je      .L4198  # 150   [c=12 l=2]  *jcc

I was not able to test them on my target, so this is only a guess, but I bet
these are the problem.
