The innermost loop over "j" is not vectorized because of the control flow inside it, namely the condition "IF ( l.NE.k )". However, that condition is invariant in the loop, so it can be hoisted out by versioning the loop on it (loop unswitching), which could allow the innermost loop to be vectorized.
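For illustration, here is a minimal sketch of what versioning the "j" loop of DGEFA by hand could look like. It is not the file used in this report: the module and subroutine names are hypothetical, and the DAXPY call is assumed to be equivalent to an array-section update so that the example is self-contained. It only shows the transformation and checks that both forms compute the same values; it makes no claim about what the vectorizer does with either form.

```fortran
module versioning_sketch
  implicit none
contains

  ! Original shape: the loop-invariant test l /= k sits inside the j loop.
  subroutine update_original(a, lda, n, k, l)
    integer, intent(in) :: lda, n, k, l
    double precision, intent(inout) :: a(lda, *)
    double precision :: t
    integer :: j
    do j = k + 1, n
      t = a(l, j)
      if (l /= k) then
        a(l, j) = a(k, j)
        a(k, j) = t
      end if
      ! Assumed stand-in for CALL DAXPY(N-k,t,A(k+1,k),1,A(k+1,j),1)
      a(k+1:n, j) = a(k+1:n, j) + t * a(k+1:n, k)
    end do
  end subroutine update_original

  ! Versioned shape: the test is hoisted, giving two branch-free loops.
  subroutine update_versioned(a, lda, n, k, l)
    integer, intent(in) :: lda, n, k, l
    double precision, intent(inout) :: a(lda, *)
    double precision :: t
    integer :: j
    if (l /= k) then
      do j = k + 1, n
        t = a(l, j)
        a(l, j) = a(k, j)
        a(k, j) = t
        a(k+1:n, j) = a(k+1:n, j) + t * a(k+1:n, k)
      end do
    else
      do j = k + 1, n
        t = a(k, j)          ! l == k here, so a(l,j) is a(k,j)
        a(k+1:n, j) = a(k+1:n, j) + t * a(k+1:n, k)
      end do
    end if
  end subroutine update_versioned

end module versioning_sketch

program check_versioning
  use versioning_sketch
  implicit none
  integer, parameter :: n = 4
  double precision :: a1(n, n), a2(n, n)
  integer :: i, j

  ! Arbitrary test data (hypothetical values).
  do j = 1, n
    do i = 1, n
      a1(i, j) = dble(i + 2*j) / dble(i*j + 1)
    end do
  end do
  a2 = a1

  ! Case l /= k: rows are swapped in every column j.
  call update_original(a1, n, n, 1, 3)
  call update_versioned(a2, n, n, 1, 3)
  if (any(abs(a1 - a2) > 1.0d-12)) error stop "versions differ (l /= k)"

  ! Case l == k: no swap.
  call update_original(a1, n, n, 2, 2)
  call update_versioned(a2, n, n, 2, 2)
  if (any(abs(a1 - a2) > 1.0d-12)) error stop "versions differ (l == k)"

  print *, "both versions agree"
end program check_versioning
```

The point of the versioned form is that each copy of the "j" loop has a single straight-line body, which is the shape the "too many BBs in loop" diagnostic complains is missing in the original.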
Testcase:

      SUBROUTINE DGEFA(A,Lda,N,Ipvt,Info)
      INTEGER Lda , N , Ipvt(*) , Info
      DOUBLE PRECISION A(Lda,*)
      DOUBLE PRECISION t
      INTEGER IDAMAX , j , k , kp1 , l , nm1
      Info = 0
      nm1 = N - 1
      IF ( nm1.GE.1 ) THEN
         DO k = 1 , nm1
            kp1 = k + 1
            l = IDAMAX(N-k+1,A(k,k),1) + k - 1
            Ipvt(k) = l
            IF ( A(l,k).EQ.0.0D0 ) THEN
               Info = k
            ELSE
               IF ( l.NE.k ) THEN
                  t = A(l,k)
                  A(l,k) = A(k,k)
                  A(k,k) = t
               ENDIF
               t = -1.0D0/A(k,k)
               CALL DSCAL(N-k,t,A(k+1,k),1)
               DO j = kp1 , N
                  t = A(l,j)
                  IF ( l.NE.k ) THEN
                     A(l,j) = A(k,j)
                     A(k,j) = t
                  ENDIF
                  CALL DAXPY(N-k,t,A(k+1,k),1,A(k+1,j),1)
               ENDDO
            ENDIF
         ENDDO
      ENDIF
      Ipvt(N) = N
      IF ( A(N,N).EQ.0.0D0 ) Info = N
      CONTINUE
      END

The vectorizer reports the following on this testcase:

/home/seb/ex/linpk.f90:24: note: not vectorized: too many BBs in loop.
/home/seb/ex/linpk.f90:24: note: bad loop form.
/home/seb/ex/linpk.f90:1: note: vectorized 0 loops in function.

If I version that loop by hand, I get the same error as for capacita.f90, caused by PRE: PRE inserts some code in the loop->latch block:

<bb 11>:
  # VUSE <PARM_NOALIAS.16_252> { PARM_NOALIAS.16 }
  pretmp.47_297 = *n_13(D);
  goto <bb 10>;

With PRE disabled, the failure occurs in the data reference analysis instead:

./linpk_corrected.f90:26: note: not vectorized: data ref analysis failed t.8_70 = (*a_25(D))[D.1406_69]
./linpk_corrected.f90:26: note: bad data references.

-- 
           Summary: Missed opportunities for vectorization due to invariant condition
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: spop at gcc dot gnu dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33245