http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51499
--- Comment #4 from fb.programming at gmail dot com 2011-12-11 11:52:30 UTC ---
Looks like there has been some great progress in gcc 4.7!
Still, I think its behavior is slightly buggy.
(1) In this case vectorization should work without
-funsafe-math-optimizations, but it doesn't. gcc 4.7 requires
-fno-signed-zeros -fno-trapping-math -fassociative-math to vectorize
the loop.
(2) The prediction:
7: not vectorized: vectorization not profitable.
is just wrong. Forcing vectorization with -fno-vect-cost-model shows
that the loop speeds up by a factor of 2.
(3) If I change all doubles to floats in the code above, it seems to
work without forcing it with -fno-vect-cost-model:
g++-4.7 -S -Wall -O2 -ftree-vectorize -ftree-vectorizer-verbose=2 \
-funsafe-math-optimizations test.cpp
Analyzing loop at test.cpp:7
Vectorizing loop at test.cpp:7
7: vectorizing stmts using SLP.
7: LOOP VECTORIZED.
test.cpp:4: note: vectorized 1 loops in function.
However, gcc hasn't vectorized the loop at all, as the assembly shows
(scalar addss instructions rather than packed addps):
.L11:
addq $1, %rax
addss %xmm0, %xmm3
cmpq %rax, %rdi
addss %xmm0, %xmm4
addss %xmm0, %xmm7
addss %xmm0, %xmm6
addss %xmm0, %xmm5
addss %xmm0, %xmm1
ja .L11
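
To illustrate the reasoning behind point (1), here is a minimal sketch. It is not the original test.cpp; the function names, the lane count of four, and the start values are all made up for illustration. It assumes the loop body has the shape visible in the listing above: one value added to several separate accumulators per iteration. Under that assumption, packing the accumulators into the lanes of one vector register keeps the order of additions within each accumulator unchanged, so the packed form should give bit-identical results without any reassociation. If the original source instead sums into a single accumulator, this argument does not apply.

```cpp
#include <array>
#include <cstddef>

// Hypothetical lane count; the listing above shows six accumulators,
// four is used here because it matches one SSE register of floats.
constexpr std::size_t kLanes = 4;

// Scalar form: one separate addition per accumulator per iteration,
// which is what a sequence of addss instructions computes.
std::array<float, kLanes> sum_scalar(float x, std::size_t n) {
    float s0 = 0.0f, s1 = 1.0f, s2 = 2.0f, s3 = 3.0f;
    for (std::size_t i = 0; i < n; ++i) {
        s0 += x;
        s1 += x;
        s2 += x;
        s3 += x;
    }
    return {s0, s1, s2, s3};
}

// Packed form, written lane by lane: this models what a single addps
// per iteration would compute. Each lane still sees the same additions
// in the same order as its scalar counterpart.
std::array<float, kLanes> sum_packed(float x, std::size_t n) {
    std::array<float, kLanes> acc = {0.0f, 1.0f, 2.0f, 3.0f};
    const std::array<float, kLanes> vx = {x, x, x, x};
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t lane = 0; lane < kLanes; ++lane) {
            acc[lane] += vx[lane];
        }
    }
    return acc;
}
```

Comparing sum_scalar(0.1f, 1000) with sum_packed(0.1f, 1000) using operator== on the arrays is one way to check that the two forms agree when compiled without any fast-math flags.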