------- Comment #7 from dominiq at lps dot ens dot fr  2009-02-01 10:37 -------
Created an attachment (id=17220)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17220&action=view)
testing complex matrix multiplication

Comment #0 is not fully accurate. After some more testing with the
attached code, I get:
- gcc 4.3.3: no vectorization,
- gcc 4.4.0 (trunk): vectorization for odd n only,
- gcc 4.4.0 + patch from 
  http://gcc.gnu.org/ml/gcc-patches/2009-01/msg01271.html:
  vectorization for all values of n (in the tested range).

The attached code also checks the result of the matrix product, which is
OK. However, as shown below (in flops/clock cycle), the timings are quite
disappointing (-m64 -O3 -ffast-math -funroll-loops). For odd n, the
vectorized code is slower than the nonvectorized one. For even n, the code
is faster with vectorization, but still significantly slower than with
ifort.

 n     4.3.3       trunk       trunk      ifort
                              +patch       11.0

124     1.33        1.36        1.81        2.61
125     1.37        1.32        1.32        2.20
126     1.36        1.37        1.79        2.55
127     1.37        1.31        1.31        2.22
128     1.38        1.39        1.86        2.64
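
To make the comparison easier to read, the figures above can be turned into
ratios. The following is only a minimal sketch: the numbers are copied from
the table, while the names `timings` and `ratios` are hypothetical helpers
introduced here for illustration and are not part of the attached test code.

```python
# Throughput in flops per clock cycle, copied from the table above.
# Columns: gcc 4.3.3, trunk, trunk + patch, ifort 11.0.
timings = {
    124: (1.33, 1.36, 1.81, 2.61),
    125: (1.37, 1.32, 1.32, 2.20),
    126: (1.36, 1.37, 1.79, 2.55),
    127: (1.37, 1.31, 1.31, 2.22),
    128: (1.38, 1.39, 1.86, 2.64),
}


def ratios(n):
    """Return (patched / 4.3.3, ifort / patched) for matrix size n."""
    base, _trunk, patched, ifort = timings[n]
    return patched / base, ifort / patched


# Print one line per matrix size with both ratios.
for n in sorted(timings):
    gain, gap = ratios(n)
    print(f"{n:4d}  patched/4.3.3 = {gain:.2f}  ifort/patched = {gap:.2f}")
```

Read this way, the patched trunk is roughly 1.3 times faster than 4.3.3 for
even n and a few percent slower for odd n, while ifort stays about 1.4 to
1.7 times ahead of the patched trunk over the tested range.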


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38968
