http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52865

Tobias Burnus <burnus at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
                 CC|                            |burnus at gcc dot gnu.org
          Component|fortran                     |tree-optimization

--- Comment #3 from Tobias Burnus <burnus at gcc dot gnu.org> 2012-04-04 14:00:30 UTC ---
(In reply to comment #2)
>       DOUBLE PRECISION Dx(*) , Dy(*)
> and
>       double X[1000], Y[1000]
> are not at all the same.

But one still gets the same result if one uses:

  void daxpy(int m, int n, double X[], double Y[], double z)

which should be close to what one gets with Fortran.

 * * *

For the Fortran loop, -ftree-vectorizer-verbose=3 shows:

14: ===== analyze_loop_nest =====
14: === vect_analyze_loop_form ===
14: not vectorized: unexpected loop form.
14: bad loop form.

For the C loop:

6: Profitability threshold is 2 loop iterations.
6: created 1 versioning for alias checks.
6: vectorizing stmts using SLP.
6: LOOP VECTORIZED.

For the Fortran loop, using ifort 12.1:

(15): (col. 19) remark: BLOCK WAS VECTORIZED.
(14): (col. 16) remark: loop was not vectorized: not inner loop.

Original dump for the Fortran loop (-fdump-tree-original):

  D.1862 = mp1;
  D.1863 = *n;
  i = D.1862;
  if (D.1863 < D.1862) goto L.2;
  countm1.0 = (unsigned int) (NON_LVALUE_EXPR <D.1863> - NON_LVALUE_EXPR <D.1862>) / 4;
  while (1)
    {
      (*dy)[(integer(kind=8)) i + -1] = (*dy)[(integer(kind=8)) i + -1] + *da * (*dx)[(integer(kind=8)) i + -1];
      (*dy)[(integer(kind=8)) (i + 1) + -1] = (*dy)[(integer(kind=8)) (i + 1) + -1] + *da * (*dx)[(integer(kind=8)) (i + 1) + -1];
      (*dy)[(integer(kind=8)) (i + 2) + -1] = (*dy)[(integer(kind=8)) (i + 2) + -1] + *da * (*dx)[(integer(kind=8)) (i + 2) + -1];
      (*dy)[(integer(kind=8)) (i + 3) + -1] = (*dy)[(integer(kind=8)) (i + 3) + -1] + *da * (*dx)[(integer(kind=8)) (i + 3) + -1];
      L.1:;
      i = i + 4;
      if (countm1.0 == 0) goto L.2;
      countm1.0 = countm1.0 + 4294967295;
    }
  L.2:;
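
For readers less used to the -fdump-tree-original syntax, the loop above can be rendered roughly in C as follows. This is only an illustrative sketch and not code from the bug report: the function names are hypothetical, the arguments are passed by value rather than by reference as Fortran does, and it assumes (n - mp1 + 1) is a multiple of 4, as a hand-unrolled loop would. The first function mirrors the dump (endless loop, exit test in the middle, trip counter counted down); the second is the same arithmetic as a plain counted loop. The first shape may be what the vectorizer reports as "unexpected loop form", but that is an assumption here.

```c
/* Hypothetical C rendering of the loop shape shown in the dump:
   1-based index i, step 4, trip counter counted down to zero. */
void daxpy_unrolled_countdown(int mp1, int n, double da,
                              const double *dx, double *dy)
{
    int i = mp1;
    if (n < mp1)
        return;                          /* "goto L.2" in the dump */

    unsigned int countm1 = (unsigned int)(n - mp1) / 4;
    for (;;) {                           /* "while (1)" in the dump */
        dy[i - 1] = dy[i - 1] + da * dx[i - 1];
        dy[i]     = dy[i]     + da * dx[i];
        dy[i + 1] = dy[i + 1] + da * dx[i + 1];
        dy[i + 2] = dy[i + 2] + da * dx[i + 2];
        i = i + 4;
        if (countm1 == 0)
            break;                       /* exit test sits before the decrement */
        countm1 = countm1 - 1;           /* dump: "+ 4294967295", i.e. -1 mod 2^32 */
    }
}

/* The same computation written as an ordinary counted loop, for comparison. */
void daxpy_unrolled_for(int mp1, int n, double da,
                        const double *dx, double *dy)
{
    for (int i = mp1; i <= n; i += 4) {
        dy[i - 1] = dy[i - 1] + da * dx[i - 1];
        dy[i]     = dy[i]     + da * dx[i];
        dy[i + 1] = dy[i + 1] + da * dx[i + 1];
        dy[i + 2] = dy[i + 2] + da * dx[i + 2];
    }
}
```

Both functions run (n - mp1) / 4 + 1 iterations when n >= mp1 and none otherwise, so they should produce identical results; only the control-flow shape differs.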