------- Comment #22 from victork at gcc dot gnu dot org 2008-02-10 15:06 -------
1. It looks like the vectorizer was enabled in both cases, since -O3 enables
the vectorizer by default. You need to add -fno-tree-vectorize to disable it
explicitly.
2. To get better results from the vectorized version, I would recommend
allocating the arrays on 16-byte-aligned boundaries and letting the compiler
know this. You can do that by allocating the arrays statically:
float pSum1[64000] __attribute__ ((__aligned__(16)));
float pSum[64000] __attribute__ ((__aligned__(16)));
float pVec1[64000] __attribute__ ((__aligned__(16)));
3. It would be better if "itBegin" started from 0 and was known at compile
time. This and [2] will allow the vectorizer to avoid realigning loads.
4. For some strange reason the run time of this test can vary significantly (by
up to 50%) from run to run, so be sure to run it several times.
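
To illustrate [2] and [3] together, here is a minimal sketch. The array
declarations are the ones from [2]; the macro N, the function names and the
loop body are assumptions made up for illustration, since the real kernel is
in the test case attached to this PR.

```c
#include <assert.h>
#include <stdint.h>

#define N 64000  /* hypothetical name for the array length used in [2] */

/* Statically allocated and 16-byte aligned, as suggested in [2]. */
static float pSum1[N] __attribute__ ((__aligned__(16)));
static float pSum[N]  __attribute__ ((__aligned__(16)));
static float pVec1[N] __attribute__ ((__aligned__(16)));

/* Returns 1 if p lies on a 16-byte boundary, 0 otherwise. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p % 16) == 0;
}

/* Hypothetical kernel, NOT the loop from the original test case.
   The index starts at 0 and the trip count is a compile-time constant,
   as suggested in [3]. */
static void accumulate(void)
{
    /* Sanity check that the alignment attribute took effect. */
    assert(is_aligned16(pSum1) && is_aligned16(pSum) && is_aligned16(pVec1));

    for (int i = 0; i < N; i++)
        pSum[i] = pSum1[i] + pVec1[i];
}
```

Building this once with -O3 and once with -O3 -fno-tree-vectorize should give
the two variants to compare, as described in [1].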
-- Victor.
--
victork at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |victork at gcc dot gnu dot
| |org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35117