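Using a gcc build from today, I tried one of the autovectorization test cases
and found it over-optimized: the loop is reported as vectorized, but the
generated code is an empty counted loop that never touches a, b or c: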
$ ~/toolchain/install/bin/gcc -v
Using built-in specs.
Target: powerpc-linux
Configured with: /home/anton/toolchain/gcc/gcc/configure --build=powerpc-linux
--host=powerpc-linux --target=powerpc-linux --enable-targets=powerpc64-linux
--enable-languages=c,c++,fortran --prefix=/home/anton/toolchain/install
Thread model: posix
gcc version 4.2.0 20060215 (experimental)
$ cat example1.c
int a[256], b[256], c[256];
foo () {
  int i;
  for (i=0; i<256; i++){
    a[i] = b[i] + c[i];
  }
}
$ ~/toolchain/install/bin/gcc -c -O1 -ftree-vectorize \
    -ftree-vectorizer-verbose=5 -maltivec -o example1_vect.o example1.c
example1.c:5: note: LOOP VECTORIZED.
example1.c:5: note: vectorized 1 loops in function.
$ objdump -d example1_vect.o
example1_vect.o: file format elf32-powerpc
Disassembly of section .text:
00000000 <foo>:
   0:   94 21 ff f0     stwu    r1,-16(r1)
   4:   38 00 00 40     li      r0,64
   8:   7c 09 03 a6     mtctr   r0
   c:   42 00 00 00     bdnz-   c <foo+0xc>
  10:   38 21 00 10     addi    r1,r1,16
  14:   4e 80 00 20     blr
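
The bdnz at offset 0xc branches to itself, so the loop counts down 64
iterations (256 ints at 4 per 16-byte AltiVec vector) without any loads, adds
or stores. A runtime check along the following lines might help confirm the
miscompile without reading the disassembly. This is only a sketch:
init_inputs and count_mismatches are hypothetical helper names that are not
part of the original test case, and foo is given an explicit void return type
here.

```c
#define N 256

int a[N], b[N], c[N];

/* The loop from example1.c, with an explicit return type (an assumption;
   the original relies on implicit int). */
void foo(void)
{
    int i;
    for (i = 0; i < N; i++) {
        a[i] = b[i] + c[i];
    }
}

/* Hypothetical helper: fill the inputs with known values and poison the
   output array so that a skipped store is detectable. */
void init_inputs(void)
{
    int i;
    for (i = 0; i < N; i++) {
        b[i] = i;
        c[i] = 2 * i;
        a[i] = -1;
    }
}

/* Hypothetical helper: count the elements where a[i] != b[i] + c[i]. */
int count_mismatches(void)
{
    int i, bad = 0;
    for (i = 0; i < N; i++) {
        if (a[i] != b[i] + c[i])
            bad++;
    }
    return bad;
}
```

Calling init_inputs(), then foo(), then count_mismatches() from a small main()
should give 0 with a correct compiler; a nonzero count would indicate that the
stores to a[] were dropped. Keeping foo in its own file, as in the original
example1.c, is probably the safer way to reproduce the report exactly.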
--
Summary: Over optimization of loop when using -ftree-vectorize
Product: gcc
Version: 4.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: anton at samba dot org
GCC target triplet: powerpc64-linux
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26359