------- Comment #7 from dominiq at lps dot ens dot fr  2009-02-01 10:37 -------
Created an attachment (id=17220)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17220&action=view)
testing complex matrix multiplication

Comment #0 is not fully accurate. With some more testing with the attached
code, I get:

- gcc 4.3.3: no vectorization,
- gcc 4.4.0 (trunk): vectorization for odd n,
- gcc 4.4.0 + patch from
  http://gcc.gnu.org/ml/gcc-patches/2009-01/msg01271.html:
  vectorization for all values of n (in the tested range).

The attached code also checks the result of the matrix product, which is OK.

Now, as shown below (in flops/clock cycle), the timings are quite
disappointing (-m64 -O3 -ffast-math -funroll-loops): for odd n, the vectorized
code is slower than the nonvectorized one; for even n, the code is faster with
vectorization, but still significantly slower than with ifort.

  n    4.3.3   trunk    trunk    ifort
                       +patch     11.0
 124    1.33    1.36     1.81     2.61
 125    1.37    1.32     1.32     2.20
 126    1.36    1.37     1.79     2.55
 127    1.37    1.31     1.31     2.22
 128    1.38    1.39     1.86     2.64

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38968
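
To put numbers on "faster" and "significantly slower", the ratios between the columns of the timings reported in this comment can be computed directly. The following is only a minimal sketch: the data are copied from the comment, while the helper names (`ratios`, `flops_per_cycle`) are hypothetical. The 8*n**3 flop count is an assumption (a common convention for a complex n x n product); the comment does not say how its flops/clock cycle figures were derived.

```python
# Throughput in flops per clock cycle, copied from the timings in this comment.
# Columns: gcc 4.3.3, gcc 4.4.0 trunk, trunk + patch, ifort 11.0.
RESULTS = {
    124: (1.33, 1.36, 1.81, 2.61),
    125: (1.37, 1.32, 1.32, 2.20),
    126: (1.36, 1.37, 1.79, 2.55),
    127: (1.37, 1.31, 1.31, 2.22),
    128: (1.38, 1.39, 1.86, 2.64),
}


def ratios(n):
    """Return (patched trunk / gcc 4.3.3, ifort / patched trunk) for size n."""
    gcc433, _trunk, patched, ifort = RESULTS[n]
    return patched / gcc433, ifort / patched


def flops_per_cycle(n, seconds, clock_hz):
    """Hypothetical conversion from a wall-clock time to flops per cycle.

    ASSUMPTION: one complex n x n matrix product is counted as 8*n**3 real
    flops (a complex multiply-add is 4 multiplies and 4 adds). The original
    comment does not state which count it used.
    """
    return 8 * n**3 / (seconds * clock_hz)


if __name__ == "__main__":
    for n in sorted(RESULTS):
        speedup, gap = ratios(n)
        print(f"n={n}: patch/4.3.3 = {speedup:.2f}, ifort/patch = {gap:.2f}")
```

A ratio below 1 in the first position means the vectorized code is slower than the gcc 4.3.3 baseline, which is the case for the odd sizes; the second ratio shows how far ifort remains ahead of the patched trunk.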