http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47271
Summary: gcc-4.6 -O1 -ftree-vectorize removes a test (if), the function generates invalid outputs
Product: gcc
Version: 4.6.0
Status: UNCONFIRMED
Severity: critical
Priority: P3
Component: c
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: victor.stin...@haypocalc.com

I tried to compile Python 3.2 (r87949) with gcc (version 4.6.0 20100908) on AMD64: Python fails with an assertion error or another strange crash. The problem comes from a loop in Python/peephole.c. Compiled with -O1, it works fine. Compiled with -O1 -ftree-vectorize, the function generates strange (invalid) outputs.

gcc-4.6 -O1:
-------------------------------------
0x0000000000480991 <+5041>: mov %eax,%edx
0x0000000000480993 <+5043>: sub %esi,%edx
0x0000000000480995 <+5045>: mov %edx,(%r12,%rax,4)
0x0000000000480999 <+5049>: movzbl 0x0(%rbp,%rax,1),%edx
0x000000000048099e <+5054>: cmp $0x9,%dl
0x00000000004809a1 <+5057>: jne 0x4809a8 <PyCode_Optimize+5064>
0x00000000004809a3 <+5059>: add $0x1,%esi
0x00000000004809a6 <+5062>: jmp 0x4809b2 <PyCode_Optimize+5074>
0x00000000004809a8 <+5064>: mov $0x3,%ecx
0x00000000004809ad <+5069>: cmp $0x59,%dl
0x00000000004809b0 <+5072>: ja 0x4809b7 <PyCode_Optimize+5079>
0x00000000004809b2 <+5074>: mov $0x1,%ecx
0x00000000004809b7 <+5079>: add %rcx,%rax
0x00000000004809ba <+5082>: cmp %rax,%rdi
0x00000000004809bd <+5085>: jg 0x480991 <PyCode_Optimize+5041>
-------------------------------------

gcc-4.6 -O1 -ftree-vectorize:
-------------------------------------
0x0000000000480991 <+5041>: mov %eax,%ecx
0x0000000000480993 <+5043>: sub %edx,%ecx
0x0000000000480995 <+5045>: mov %ecx,(%r12,%rax,4)
0x0000000000480999 <+5049>: movzbl 0x0(%rbp,%rax,1),%ecx
0x000000000048099e <+5054>: lea 0x1(%rdx),%esi
0x00000000004809a1 <+5057>: cmp $0x9,%cl
0x00000000004809a4 <+5060>: cmovne %edx,%esi
0x00000000004809a7 <+5063>: cmove %esi,%edx
0x00000000004809aa <+5066>: setne %cl
0x00000000004809ad <+5069>: movzbl %cl,%ecx
0x00000000004809b0 <+5072>: lea 0x1(%rax,%rcx,2),%rax
0x00000000004809b5 <+5077>: cmp %rax,%rdi
0x00000000004809b8 <+5080>: jg 0x480991 <PyCode_Optimize+5041>
-------------------------------------

Extract of the correct output (-O1):
----
addrmap[0]=0
addrmap[3]=3
addrmap[4]=4
addrmap[7]=7
addrmap[10]=10
addrmap[13]=13
addrmap[16]=16
addrmap[19]=19
addrmap[22]=22
addrmap[23]=22
----

With -O1 -ftree-vectorize, only addrmap[0] and addrmap[3] are correct:
----
addrmap[0]=0
addrmap[3]=3
addrmap[4]=0
addrmap[7]=32767
addrmap[10]=16777216
addrmap[13]=0
addrmap[16]=469314288
addrmap[19]=32767
addrmap[22]=469315151
addrmap[23]=32767
----

See also: http://bugs.python.org/issue9880

My setup:
 * Intel(R) Pentium(R) 4 CPU 3.00GHz
 * Debian Sid
 * gcc (Debian 20110106-1) 4.6.0 20110106 (experimental) [trunk revision 168538]
 * Python 3.2 (r87949)
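
For readers who prefer C to assembly, the difference between the two listings can be sketched as follows. This is a reconstruction from the disassembly only, not the code from Python/peephole.c: the names OP_NOP, OP_HAVE_ARGUMENT, step_expected, step_observed and fill_addrmap are hypothetical, and the constants 9 and 90 are inferred from the immediates in `cmp $0x9` and `cmp $0x59` / `ja`. The -O1 code advances the index by 3 only when the opcode is above 0x59, while the -O1 -ftree-vectorize code advances by 1 + 2*(opcode != 9), so the second comparison appears to have been dropped.

```c
#include <assert.h>
#include <stddef.h>

/* Constants inferred from the disassembly immediates; names are assumptions. */
enum { OP_NOP = 9, OP_HAVE_ARGUMENT = 90 };

typedef size_t (*step_fn)(unsigned char op);

/* Step as computed by the -O1 code: 3 if the opcode is above 0x59, else 1. */
static size_t step_expected(unsigned char op)
{
    return op >= OP_HAVE_ARGUMENT ? 3 : 1;
}

/* Step as computed by the -O1 -ftree-vectorize code:
 * setne %cl; lea 0x1(%rax,%rcx,2),%rax  =>  1 + 2 * (op != 9).
 * The comparison against 0x59 no longer takes part. */
static size_t step_observed(unsigned char op)
{
    return 1 + 2 * (size_t)(op != OP_NOP);
}

/* Walk the byte string, storing addrmap[i] = i - nops at every index the
 * loop visits. Slots that are skipped keep their previous contents.
 * Returns the number of indices visited. */
static size_t fill_addrmap(const unsigned char *code, size_t len,
                           int *addrmap, step_fn step)
{
    size_t i, visited = 0;
    int nops = 0;

    assert(code != NULL && addrmap != NULL && step != NULL);
    for (i = 0; i < len; i += step(code[i])) {
        addrmap[i] = (int)i - nops;
        if (code[i] == OP_NOP)
            nops++;
        visited++;
    }
    return visited;
}
```

Under these assumptions, any opcode below 90 other than 9 makes step_observed skip two bytes too many, which would leave addrmap slots such as addrmap[4] unwritten. That matches the uninitialized-looking values in the -O1 -ftree-vectorize output above.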