http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56595
             Bug #: 56595
           Summary: Tree-ssa-pre can create loop carried dependencies
                    which prevent loop vectorization.
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: ysrum...@gmail.com

In some cases PRE can create loop-carried dependencies spanning multiple
iterations (also known as scalar replacement), which prevents the loop from
being vectorized. This deficiency can be illustrated with the attached
test case.

For the statement

      DO I = 0,I2
        T1 = 0.5D0 * (U1(I,J,K) + U1(I+1,J,K))

PRE creates a loop-carried dependence:

  <bb 172>:
  ...
  pretmp_690 = MEM[(real(kind=8)[0:] *)pretmp_675][pretmp_689];
  ...

  <bb 107>:
  # i_1 = PHI <0(172), i_437(175)>
  # prephitmp_691 = PHI <pretmp_690(172), _440(175)>

Note that in this particular test case the arrays have an unknown stride1.
If the arrays have stride1 == 1, the transformation does not happen, as in
the following simple test case, which is successfully vectorized:

      subroutine bar(a,b,c,d,n, m)
      integer n, m
      real*8 a(n,*), b(n,*), c(n,*), d(n,*)
      do j=1,m
        do i=1,m
          x1 = 0.5 * (a(i,j) + a(i+1,j))
          x2 = 0.5 * (b(i,j) + b(i+1,j))
          x3 = 0.5 * (c(i,j) + c(i+1,j))
          d(i,j) = (x1 + x2 + x3) / 3.0
        enddo
      enddo
      end
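
For readers less familiar with GIMPLE dumps, the effect described above can be sketched in C. This is a hand-written illustration, not compiler output: the function names, the `stride` parameter, and the mapping of `prev` and `next` onto `prephitmp_691` and `_440` are assumptions made for the example. It also assumes the output array does not overlap the input.

```c
#include <assert.h>
#include <stddef.h>

/* Original form: every iteration performs both loads itself, so the
   iterations are independent of each other.
   Reads u[0] .. u[n*stride]; t must not overlap u. */
void avg_original(const double *u, ptrdiff_t stride, int n, double *t)
{
    for (int i = 0; i < n; i++)
        t[i] = 0.5 * (u[i * stride] + u[(i + 1) * stride]);
}

/* Hand-written equivalent of the form described in the report: the value
   loaded as u[(i+1)*stride] is kept in a scalar and reused as u[i*stride]
   in the next iteration. */
void avg_after_pre(const double *u, ptrdiff_t stride, int n, double *t)
{
    if (n <= 0)
        return;

    /* Load hoisted in front of the loop (assumed analogue of pretmp_690). */
    double prev = u[0];

    for (int i = 0; i < n; i++) {
        /* The one load left in the loop body (assumed analogue of _440). */
        double next = u[(i + 1) * stride];
        t[i] = 0.5 * (prev + next);
        /* Loop-carried scalar (assumed analogue of the prephitmp_691 PHI):
           iteration i+1 now needs a value produced by iteration i. */
        prev = next;
    }
}

/* Self-check: both forms must compute the same values. */
void check_equivalence(void)
{
    const double u[10] = {1, 100, 3, 100, 5, 100, 7, 100, 9, 100};
    double a[4], b[4];

    avg_original(u, 2, 4, a);
    avg_after_pre(u, 2, 4, b);
    for (int i = 0; i < 4; i++)
        assert(a[i] == b[i]);
}
```

Both functions compute the same result, but in the second one each iteration consumes `prev` from the previous iteration. That chain is the kind of loop-carried dependence the report identifies as the obstacle to vectorization.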