http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46519
--- Comment #4 from H.J. Lu <hjl.tools at gmail dot com> 2010-11-18 01:48:13 UTC ---
(In reply to comment #3)
> Created attachment 22437 [details]
> A testcase
>
> With -O3 -funroll-loops -ffast-math -mavx:
>
>         movl    Token_Id(%rip), %eax
>         vmovapd 32(%rsp), %ymm0
>         cmpl    $1, %eax
>         je      .L4
>         cmpl    $2, %eax
>         je      .L5
>         testl   %eax, %eax
>         jne     .L464
>         leaq    124(%rsp), %rsi
>         leaq    64(%rsp), %rdi
>         call    _Z16Parse_Rel_FactorPdPi   <<<< Missing vzeroupper
>         movl    124(%rsp), %esi
>
> .L856:
>         vmovapd %ymm0, 32(%rsp)
>         call    _Z9Get_Tokenv   <<<< Missing vzeroupper
>         movl    Token_Id(%rip), %eax
>         vmovapd 32(%rsp), %ymm0
>
> ...
> .L491:
>         leaq    252(%rsp), %rsi
>         leaq    64(%rsp), %rdi
>         vmovapd %ymm0, 32(%rsp)
>         call    _ZL14Parse_Rel_TermPdPi   <<<< Missing vzeroupper
>         movl    252(%rsp), %esi
>         movl    224(%rsp), %edx
>         vmovapd 32(%rsp), %ymm0
>         cmpl    %edx, %esi

Some of the 256-bit vector insns are introduced by loop unrolling. Maybe we
should drop the use_avx256_p check, since it isn't reliable.
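For anyone who wants to scan a larger assembly dump for the same pattern, here is a rough sketch of a checker. It is not how GCC's vzeroupper insertion works. It rests on a deliberately simplified assumption: any instruction that names a %ymm register may leave the upper halves dirty, vzeroupper or vzeroall cleans them, and control flow is ignored (labels are skipped and the scan is purely linear). The function name calls_missing_vzeroupper is made up for this illustration.

```python
import re

# Assumed model: an instruction that mentions %ymmN may dirty the upper
# 128 bits; vzeroupper/vzeroall clears them. Branches are not followed.
YMM = re.compile(r"%ymm\d+")

def calls_missing_vzeroupper(asm):
    """Return call targets reached while the upper YMM state may be dirty."""
    dirty = False
    missing = []
    for raw in asm.splitlines():
        line = raw.strip()
        if not line or line.endswith(":"):
            continue  # skip blank lines and labels
        parts = line.split(None, 1)
        mnemonic = parts[0]
        operands = parts[1] if len(parts) > 1 else ""
        if mnemonic in ("vzeroupper", "vzeroall"):
            dirty = False
        elif mnemonic == "call":
            if dirty:
                missing.append(operands.strip())
        elif YMM.search(operands):
            dirty = True
    return missing

# One of the sequences quoted above, without the annotations.
sample = """
.L856:
    vmovapd %ymm0, 32(%rsp)
    call    _Z9Get_Tokenv
    movl    Token_Id(%rip), %eax
    vmovapd 32(%rsp), %ymm0
"""

print(calls_missing_vzeroupper(sample))  # ['_Z9Get_Tokenv']
```

Because the scan is linear, it can both miss and over-report cases that depend on which branch reaches the call, so treat its output as a list of places to look at by hand.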