Re: Question about vectorization limit

2013-05-31 Thread Xinliang David Li
Yes, the loop vectorizer relies on earlier passes to straighten out control flow (unswitch, index splitting, loop distribution, ifcvt, etc.). Intel ICC is pretty good at it. For the following simple made-up case, icc vectorizes the loop: int a[1]; int b[1]; int foo (int n) { int i; for (i

Re: Question about vectorization limit

2013-05-31 Thread Xinliang David Li
On Fri, May 31, 2013 at 6:54 AM, Jakub Jelinek wrote:
> On Fri, May 31, 2013 at 03:48:59PM +0200, Toon Moene wrote:
>> >If you rewrite the above into:
>> >SUBROUTINE XYZ(A, B, N)
>> >DIMENSION A(N), B(N)
>> >DO I = 1, N
>> >C = B(I)
>> >IF (A(I)> 0.0) THEN
>> > A(I) = C / A(I)
>> >

Re: Question about vectorization limit

2013-05-31 Thread Toon Moene
On 05/31/2013 03:54 PM, Jakub Jelinek wrote:
> I wrote:
But this "inner loop" has at least 3 basic blocks - so what does the "loop->num_nodes != 2" test exactly codify?
With the above testcase it has just 2. Before ifcvt pass it still has 4:
Ah, I missed that subtle part. So my example is

Re: Question about vectorization limit

2013-05-31 Thread Jakub Jelinek
On Fri, May 31, 2013 at 03:48:59PM +0200, Toon Moene wrote:
> >If you rewrite the above into:
> >SUBROUTINE XYZ(A, B, N)
> >DIMENSION A(N), B(N)
> >DO I = 1, N
> >C = B(I)
> >IF (A(I)> 0.0) THEN
> > A(I) = C / A(I)
> >ELSE
> > A(I) = C
> >ENDIF
> >ENDDO
> >END
> >
> >th
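For readers more at home in C, the rewrite quoted above can be sketched roughly as follows. This is only an illustrative analogue: the function names xyz_original and xyz_hoisted are hypothetical and do not appear in the thread, and nothing here claims that any particular GCC version vectorizes either form. Note also that Fortran dummy arguments are assumed not to alias, which plain C pointers do not guarantee.

```c
#include <stddef.h>

/* Original shape: b[i] is loaded separately inside each branch. */
void xyz_original(float *a, const float *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (a[i] > 0.0f)
            a[i] = b[i] / a[i];
        else
            a[i] = b[i];
    }
}

/* Rewritten shape: the load of b[i] is hoisted above the conditional,
   so both branches use the same temporary c. */
void xyz_hoisted(float *a, const float *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        float c = b[i];
        if (a[i] > 0.0f)
            a[i] = c / a[i];
        else
            a[i] = c;
    }
}
```

Both functions compute the same values when a and b do not overlap; the only difference is where the load of b[i] appears in the source.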

Re: Question about vectorization limit

2013-05-31 Thread Richard Biener
On Fri, May 31, 2013 at 3:48 PM, Toon Moene wrote:
> On 05/31/2013 03:41 PM, Jakub Jelinek wrote:
>
>> On Fri, May 31, 2013 at 03:21:51PM +0200, Toon Moene wrote:
>
>>> SUBROUTINE XYZ(A, B, N)
>>> DIMENSION A(N), B(N)
>>> DO I = 1, N
>>> IF (A(I)> 0.0) THEN
>>>A(I) = B(I) / A(I)
>>>

Re: Question about vectorization limit

2013-05-31 Thread Toon Moene
On 05/31/2013 03:41 PM, Jakub Jelinek wrote:
On Fri, May 31, 2013 at 03:21:51PM +0200, Toon Moene wrote:
SUBROUTINE XYZ(A, B, N)
DIMENSION A(N), B(N)
DO I = 1, N
   IF (A(I)> 0.0) THEN
      A(I) = B(I) / A(I)
   ELSE
      A(I) = B(I)
   ENDIF
ENDDO
END
Well, in this case (with -Ofas

Re: Question about vectorization limit

2013-05-31 Thread Jakub Jelinek
On Fri, May 31, 2013 at 03:21:51PM +0200, Toon Moene wrote:
> SUBROUTINE XYZ(A, B, N)
> DIMENSION A(N), B(N)
> DO I = 1, N
>    IF (A(I) > 0.0) THEN
>       A(I) = B(I) / A(I)
>    ELSE
>       A(I) = B(I)
>    ENDIF
> ENDDO
> END

Well, in this case (with -Ofast) it is just the case that ifcvt or

Re: Question about vectorization limit

2013-05-31 Thread Richard Biener
On Fri, May 31, 2013 at 3:21 PM, Toon Moene wrote:
> On 05/31/2013 10:20 AM, Richard Biener wrote:
>
>> So - I doubt that you both do not get any ICEs and more performance.
>
> I added the second suggested patch:
>
> Index: tree-vect-loop-manip.c
> ===

Re: Question about vectorization limit

2013-05-31 Thread Toon Moene
On 05/31/2013 10:20 AM, Richard Biener wrote:
So - I doubt that you both do not get any ICEs and more performance.
I added the second suggested patch:
Index: tree-vect-loop-manip.c
===
--- tree-vect-loop-manip.c (revision 19

Re: Question about vectorization limit

2013-05-31 Thread Jakub Jelinek
On Fri, May 31, 2013 at 10:20:01AM +0200, Richard Biener wrote:
> The limit is there because a loop with more than one basic-block with code
> necessarily has to have conditionally executed BBs and eventually PHI nodes
> at merge points.
>
> Now, it may be that we properly determine if we can hand
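The remark about conditionally executed blocks and merge points can be illustrated with a small C sketch. The names scale_branchy and scale_select are hypothetical and not from the thread. The first loop body has an if/else whose two arms each define x, so the store after the branch is a merge point (where a PHI node would appear in SSA form). The second expresses the same computation as a single select, which is conceptually the straight-line shape if-conversion aims for. How a given compiler actually lowers either form is not asserted here.

```c
#include <stddef.h>

/* Branchy body: two conditionally executed blocks feed one merge point. */
void scale_branchy(int *a, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int x;
        if (a[i] > 0)
            x = a[i] * 2;   /* executed only when a[i] > 0 */
        else
            x = 0;          /* executed only otherwise */
        a[i] = x;           /* merge point: x has two reaching definitions */
    }
}

/* Select form: the same result written as one conditional expression. */
void scale_select(int *a, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int v = a[i];
        a[i] = (v > 0) ? v * 2 : 0;
    }
}
```

Both functions produce identical arrays for the same input; only the shape of the loop body differs.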

Re: Question about vectorization limit

2013-05-31 Thread Richard Biener
On Thu, May 30, 2013 at 2:46 AM, Dehao Chen wrote:
> Hi,
>
> In tree-vect-loop.c, it limits the vectorization only to loops that have 2
> BBs:
>
> /* Inner-most loop. We currently require that the number of BBs is
> exactly 2 (the header and latch). Vectorizable inner-most loops

Re: Question about vectorization limit

2013-05-30 Thread Dehao Chen
Actually, you need another patch to make this work:

Index: gcc/tree-vect-loop-manip.c
===
--- gcc/tree-vect-loop-manip.c (revision 199416)
+++ gcc/tree-vect-loop-manip.c (working copy)
@@ -855,7 +855,6 @@
 /* All loops have an o

Re: Question about vectorization limit

2013-05-30 Thread Toon Moene
On 05/30/2013 02:46 AM, Dehao Chen wrote:
In tree-vect-loop.c, it limits the vectorization only to loops that have 2 BBs:
/* Inner-most loop. We currently require that the number of BBs is
   exactly 2 (the header and latch). Vectorizable inner-most loops
   look like thi

Question about vectorization limit

2013-05-29 Thread Dehao Chen
Hi,

In tree-vect-loop.c, it limits the vectorization only to loops that have 2 BBs:

/* Inner-most loop. We currently require that the number of BBs is
   exactly 2 (the header and latch). Vectorizable inner-most loops
   look like this:

   (pre-header)