------- Comment #22 from victork at gcc dot gnu dot org 2008-02-10 15:06 -------

1. It looks like the vectorizer was enabled in both cases, since -O3 enables the vectorizer by default. You need to add -fno-tree-vectorize to disable it explicitly.

2. To get better results from the vectorized version, I would recommend allocating the arrays at 16-byte-aligned boundaries and letting the compiler know this. You can do it by static allocation of the arrays:

     float pSum1[64000] __attribute__ ((__aligned__(16)));
     float pSum[64000] __attribute__ ((__aligned__(16)));
     float pVec1[64000] __attribute__ ((__aligned__(16)));

3. It would be better if "itBegin" started from 0 and was known at compile time. This and [2] will allow the vectorizer to avoid realigning loads.

4. For some strange reason, the run time of this test can vary significantly (up to 50%) from run to run, so be sure to run it several times.

-- Victor.

-- 

victork at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |victork at gcc dot gnu dot
                   |                            |org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35117
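
For illustration, suggestions 2 and 3 could be combined roughly as in the sketch below. The loop body of the test case is not shown in this comment, so the accumulate() kernel, the N macro and the is_aligned16() helper are hypothetical names introduced only for this example. Only the three array declarations come from the comment itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define N 64000  /* array length taken from the declarations in point 2 */

/* File-scope arrays aligned to 16 bytes (GCC attribute syntax), so the
   compiler knows the alignment at compile time. */
float pSum1[N] __attribute__ ((__aligned__(16)));
float pSum[N]  __attribute__ ((__aligned__(16)));
float pVec1[N] __attribute__ ((__aligned__(16)));

/* Hypothetical helper: returns nonzero if p is 16-byte aligned. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p % 16) == 0;
}

/* Hypothetical kernel standing in for the loop from the test case.
   The loop starts at 0 and its bound is a compile-time constant,
   which is what point 3 asks for. */
static void accumulate(void)
{
    assert(is_aligned16(pSum) && is_aligned16(pSum1) && is_aligned16(pVec1));
    for (size_t i = 0; i < N; i++)
        pSum[i] = pSum1[i] + pVec1[i];
}
```

To compare the two variants as described in point 1, the same file would be built once with -O3 and once with -O3 -fno-tree-vectorize, and each binary timed over several runs because of the variance mentioned in point 4.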