------- Comment #3 from sfilippone at uniroma2 dot it 2009-11-19 17:17 -------
(In reply to comment #2)
> -ftree-vectorizer-verbose=2 tells you:
>
> eval.f90:35: note: not vectorized: relevant stmt not supported: D.1684_73 =
> ((D.1683_72));
>
> eval.f90:32: note: not vectorized: relevant stmt not supported: D.1684_58 =
> ((D.1683_57));
>
> PAREN_EXPRs are new in 4.4 and I believe they cannot be turned off
> right now.
>
> The loops are
>
>   do i=1,nnd
>      x(i) = 1.d0 + (1.d0*i)/nnd
>   end do
>   do i=1,n
>      foo4(i) = 1.d0 + (1.d0*i)/n
>   end do
>
> where the vectorizer doesn't know how to ensure evaluation order is
> preserved when trying to vectorize (1.d0*i)/n. Writing them as
> 1.d0*i/n vectorizes the function.
>
> Still the performance is lower by a factor of two compared to 4.3
> (even with -ffast-math).
>
> Probably the bug should be split.
>

Well, the performance drop I am looking at is in the subroutine. The
initialization loops are (to me) irrelevant: I had posted a previous version
to the mailing list where the initialization was done with random_number,
and the situation was the same. A run with profiling shows that more than
99% of the time is spent in eval_.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42108