------- Comment #5 from dominiq at lps dot ens dot fr 2008-05-13 15:27 -------
I just noticed today that the vectorization of the variant induct.v2.f90 depends on the -m64 flag:

[ibook-dhum] source/dir_indu% gfc -m64 -O3 -ffast-math -funroll-loops -ftree-vectorizer-verbose=2 indu.v2.f90
...
indu.v2.f90:2322: note: not vectorized: unsupported use in stmt.
indu.v2.f90:2245: note: not vectorized: unsupported unaligned store.
indu.v2.f90:2244: note: vectorizing stmts using SLP.
indu.v2.f90:2244: note: LOOP VECTORIZED.
indu.v2.f90:2146: note: not vectorized: unsupported use in stmt.
indu.v2.f90:2069: note: not vectorized: unsupported unaligned store.
indu.v2.f90:2068: note: vectorizing stmts using SLP.
indu.v2.f90:2068: note: LOOP VECTORIZED.
indu.v2.f90:1976: note: not vectorized: complicated access pattern.
indu.v2.f90:1875: note: vectorized 2 loops in function.
indu.v2.f90:1816: note: not vectorized: unsupported use in stmt.
indu.v2.f90:1771: note: not vectorized: unsupported unaligned store.
indu.v2.f90:1770: note: vectorizing stmts using SLP.
indu.v2.f90:1770: note: LOOP VECTORIZED.
indu.v2.f90:1682: note: not vectorized: unsupported use in stmt.
indu.v2.f90:1633: note: not vectorized: unsupported unaligned store.
indu.v2.f90:1632: note: vectorizing stmts using SLP.
indu.v2.f90:1632: note: LOOP VECTORIZED.
indu.v2.f90:1543: note: not vectorized: complicated access pattern.
indu.v2.f90:1441: note: vectorized 2 loops in function.
...

[ibook-dhum] source/dir_indu% gfc -O3 -ffast-math -funroll-loops -ftree-vectorizer-verbose=2 indu.v2.f90
...
indu.v2.f90:2334: note: LOOP VECTORIZED.
indu.v2.f90:2245: note: not vectorized: unsupported unaligned store.
indu.v2.f90:2244: note: vectorizing stmts using SLP.
indu.v2.f90:2244: note: LOOP VECTORIZED.
indu.v2.f90:2158: note: LOOP VECTORIZED.
indu.v2.f90:2069: note: not vectorized: unsupported unaligned store.
indu.v2.f90:2068: note: vectorizing stmts using SLP.
indu.v2.f90:2068: note: LOOP VECTORIZED.
indu.v2.f90:1976: note: not vectorized: complicated access pattern.
indu.v2.f90:1875: note: vectorized 4 loops in function.
indu.v2.f90:1825: note: LOOP VECTORIZED.
indu.v2.f90:1771: note: not vectorized: unsupported unaligned store.
indu.v2.f90:1770: note: vectorizing stmts using SLP.
indu.v2.f90:1770: note: LOOP VECTORIZED.
indu.v2.f90:1691: note: LOOP VECTORIZED.
indu.v2.f90:1633: note: not vectorized: unsupported unaligned store.
indu.v2.f90:1632: note: vectorizing stmts using SLP.
indu.v2.f90:1632: note: LOOP VECTORIZED.
indu.v2.f90:1543: note: not vectorized: complicated access pattern.
indu.v2.f90:1441: note: vectorized 4 loops in function.
...

Where the nested loop vectorized without -m64 at 1691 is:

...
          do j = 1, 9
              c_vector(3) = 0.5_longreal * h_coil * z1gauss(j)
!
!       rotate coil vector into the global coordinate system and translate it
!
              rot_c_vector(1) = rot_i_vector(1) + rotate_coil(1,3) * c_vector(3)
              rot_c_vector(2) = rot_i_vector(2) + rotate_coil(2,3) * c_vector(3)
              rot_c_vector(3) = rot_i_vector(3) + rotate_coil(3,3) * c_vector(3)
!
              do k = 1, 9       ! <==== line 1691
!
!       rotate quad vector into the global coordinate system
!
                  rot_q_vector(1) = rot_q1_vector(k,1) - rot_c_vector(1)
                  rot_q_vector(2) = rot_q1_vector(k,2) - rot_c_vector(2)
                  rot_q_vector(3) = rot_q1_vector(k,3) - rot_c_vector(3)
!
!       compute and add in quadrature term
!
                  numerator = dotp * w1gauss(j) * w2gauss(k)
                  dotp2=rot_q_vector(1)*rot_q_vector(1)+rot_q_vector(2)*rot_q_vector(2)+ &
                        rot_q_vector(3)*rot_q_vector(3)
                  denominator = sqrt(dotp2)
                  l12_lower = l12_lower + numerator/denominator
              end do
          end do
...

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36099