On Fri, May 31, 2013 at 03:21:51PM +0200, Toon Moene wrote:
> SUBROUTINE XYZ(A, B, N)
> DIMENSION A(N), B(N)
> DO I = 1, N
>    IF (A(I) > 0.0) THEN
>       A(I) = B(I) / A(I)
>    ELSE
>       A(I) = B(I)
>    ENDIF
> ENDDO
> END

Well, in this case (with -Ofast) the problem is just that ifcvt or an
earlier pass did a poor job of moving the load from B(I) before the
conditional. If we ignore exceptions, that should be possible, as both
branches read from the same memory. The store to A(I) is already moved
out of the conditional by cselim.

If you rewrite the above into:

      SUBROUTINE XYZ(A, B, N)
      DIMENSION A(N), B(N)
      DO I = 1, N
         C = B(I)
         IF (A(I) > 0.0) THEN
            A(I) = C / A(I)
         ELSE
            A(I) = C
         ENDIF
      ENDDO
      END

then it is vectorized just fine.

Similarly, even if this optimization isn't performed, the loop should be
optimizable with masked loads. See
http://gcc.gnu.org/ml/gcc-patches/2012-11/msg00202.html
though we probably just want a better infrastructure for that.

	Jakub