The modified testcase from PR18767 shows a problem where the loop count
variable still remains in the vectorized loop.
Compiling the modified testcase with 'g++ -O2 -march=pentium4 -ftree-vectorize'
produces the following code for the first loop:
...
leal -24(%ebp), %esi
leal -40(%ebp), %ebx
leal -56(%ebp), %ecx
xorl %eax, %eax
xorl %edx, %edx
.L2:
addl $1, %eax
movaps (%edx,%esi), %xmm0
mulps (%ebx,%edx), %xmm0
movaps %xmm0, (%edx,%ecx)
addl $16, %edx
cmpl $1, %eax
jne .L2
...
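It looks like the compiler does not figure out that the conditional jump is
never taken: %eax starts at 0, is incremented to 1, and is then compared
against 1, so the loop body runs exactly once.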
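For illustration only, here is a rough sketch of the kind of loop involved and
of the control flow in the listing above. This is not the PR18767 testcase
itself; the names mul_scalar, backward_jumps_taken, N and VF are made up here,
and the array size of four floats is an assumption read off the three 16-byte
stack slots in the assembly.

```cpp
#include <cstddef>

// Hypothetical stand-in for the testcase, not the original source.
constexpr std::size_t N = 4;   // assumed: four floats = one 16-byte SSE vector
constexpr std::size_t VF = 4;  // assumed vectorization factor (movaps/mulps)

// Scalar form of the loop as it might be written in the source.
void mul_scalar(const float *a, const float *b, float *c)
{
    for (std::size_t i = 0; i < N; ++i)
        c[i] = a[i] * b[i];
}

// Models the control flow of the emitted .L2 loop and returns how many
// times the backward jump (jne .L2) would actually be taken.
std::size_t backward_jumps_taken()
{
    std::size_t iv = 0;     // plays the role of %eax (xorl %eax, %eax)
    std::size_t jumps = 0;
    for (;;) {
        ++iv;               // addl $1, %eax
        if (iv == N / VF)   // cmpl $1, %eax
            break;          // jne .L2 falls through
        ++jumps;            // jne .L2 taken
    }
    return jumps;
}
```

Under these assumptions the vectorized trip count N / VF is 1, so the counter,
the compare and the jump could all be removed, leaving only the three vector
instructions.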
However, with 'g++ -O2 -march=pentium4 -ftree-vectorize -funroll-loops' the
generated code is a lot better:
...
movaps -24(%ebp), %xmm0
mulps -40(%ebp), %xmm0
movaps %xmm0, -56(%ebp)
...
Uros.
--
Summary: Redundant loop count insns in simple vectorized loop
Product: gcc
Version: 4.0.0
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: uros at kss-loka dot si
CC: gcc-bugs at gcc dot gnu dot org
GCC build triplet: i686-pc-linux-gnu
GCC host triplet: i686-pc-linux-gnu
GCC target triplet: i686-pc-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18777