http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46519

--- Comment #4 from H.J. Lu <hjl.tools at gmail dot com> 2010-11-18 01:48:13 UTC ---
(In reply to comment #3)
> Created attachment 22437 [details]
> A testcase
> 
> With -O3 -funroll-loops -ffast-math -mavx:
> 
>         movl    Token_Id(%rip), %eax
>         vmovapd 32(%rsp), %ymm0
>         cmpl    $1, %eax
>         je      .L4
>         cmpl    $2, %eax
>         je      .L5
>         testl   %eax, %eax
>         jne     .L464
>         leaq    124(%rsp), %rsi
>         leaq    64(%rsp), %rdi
>         call    _Z16Parse_Rel_FactorPdPi   <<<< Missing vzeroupper
>         movl    124(%rsp), %esi
> 
> .L856:
>         vmovapd %ymm0, 32(%rsp)
>         call    _Z9Get_Tokenv  <<<< Missing vzeroupper
>         movl    Token_Id(%rip), %eax 
>         vmovapd 32(%rsp), %ymm0
> 
> ...
> .L491:
>         leaq    252(%rsp), %rsi
>         leaq    64(%rsp), %rdi 
>         vmovapd %ymm0, 32(%rsp)    
>         call    _ZL14Parse_Rel_TermPdPi   <<<< Missing vzeroupper
>         movl    252(%rsp), %esi
>         movl    224(%rsp), %edx
>         vmovapd 32(%rsp), %ymm0
>         cmpl    %edx, %esi

Some of the 256-bit vector insns are introduced by loop unrolling. Maybe
we should drop the use_avx256_p check, since it isn't reliable.
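
For checking the generated assembly for this kind of omission, here is a
minimal sketch in Python. It is not GCC code: the helper name
calls_missing_vzeroupper is hypothetical, and it assumes a straight-line
model that ignores control flow, treats any instruction naming a %ymm
register as dirtying the upper halves, and treats vzeroupper/vzeroall as
clearing them.

```python
import re

YMM = re.compile(r"%ymm\d+")
CALL = re.compile(r"^\s*call\s+(\S+)")
VZERO = re.compile(r"\bvzero(upper|all)\b")


def calls_missing_vzeroupper(asm_lines):
    """Return the targets of calls reached with a possibly dirty YMM state.

    Assumption: lines are scanned in textual order only, so jumps and
    labels are ignored. This is a rough diagnostic, not a data-flow pass.
    """
    dirty = False
    missing = []
    for line in asm_lines:
        call = CALL.match(line)
        if call:
            if dirty:
                missing.append(call.group(1))
            # The state after the call is unknown; keep it as it was.
            continue
        if VZERO.search(line):
            dirty = False
        elif YMM.search(line):
            dirty = True
    return missing
```

Fed the first quoted fragment, the vmovapd load into %ymm0 marks the state
dirty, so the call to _Z16Parse_Rel_FactorPdPi would be reported.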
