Re: Question about vectorization limit

Xinliang David Li Fri, 31 May 2013 12:34:02 -0700

Yes, the loop vectorizer relies on earlier passes to straighten out
control flow (unswitching, index splitting, loop distribution, ifcvt,
etc.). Intel ICC is pretty good at it. For the following simple made-up
case, icc vectorizes the loop.


int a[10000];
int b[10000];

int foo (int n)
{
    int i;
    for (i = 0; i < 10000; i++)
      {
        if (a[i] > n)
          a[i] += b[i];
        else if (a[i] > 1000)
          a[i] -= b[i];
        else
          a[i] -= 1;
      }

    return n;
}
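
To make "straightening out control flow" concrete, here is a rough
hand-written sketch of what an if-converted form of the loop above could
look like: the if/else chain becomes a single select expression, so the
loop body is one straight-line block. The names update_branchy and
update_select and the pointer/length parameters are made up for
illustration; this is not what icc or GCC's ifcvt pass literally emits,
only the general shape.

```c
/* Branchy form, same logic as foo() above but taking the arrays as
   parameters (an assumption made here so the sketch is easy to test). */
void update_branchy(int *a, const int *b, int len, int n)
{
    for (int i = 0; i < len; i++) {
        if (a[i] > n)
            a[i] += b[i];
        else if (a[i] > 1000)
            a[i] -= b[i];
        else
            a[i] -= 1;
    }
}

/* Hand if-converted form: both loads happen unconditionally and the
   result is chosen by a select, leaving no branches in the loop body. */
void update_select(int *a, const int *b, int len, int n)
{
    for (int i = 0; i < len; i++) {
        int ai = a[i];
        int bi = b[i];   /* safe to load up front: b[i] is always in bounds */
        a[i] = (ai > n)    ? ai + bi
             : (ai > 1000) ? ai - bi
             :               ai - 1;
    }
}
```

Both functions should compute the same result for any input; the second
one just spells out the shape a vectorizer can map onto compare and
blend operations.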

David

On Fri, May 31, 2013 at 6:41 AM, Jakub Jelinek <ja...@redhat.com> wrote:
> On Fri, May 31, 2013 at 03:21:51PM +0200, Toon Moene wrote:
>> SUBROUTINE XYZ(A, B, N)
>> DIMENSION A(N), B(N)
>> DO I = 1, N
>>    IF (A(I) > 0.0) THEN
>>       A(I) = B(I) / A(I)
>>    ELSE
>>       A(I) = B(I)
>>    ENDIF
>> ENDDO
>> END
>
> Well, in this case (with -Ofast) it is just the case that ifcvt
> or earlier passes did a poor job at moving the load from B(I)
> before the conditional, which, if we ignore exceptions, should be possible,
> as both branches read from the same memory.
> The store to A(I) is already hoisted by cselim out of the conditional.
>
> If you rewrite the above into:
> SUBROUTINE XYZ(A, B, N)
> DIMENSION A(N), B(N)
> DO I = 1, N
>    C = B(I)
>    IF (A(I) > 0.0) THEN
>       A(I) = C / A(I)
>    ELSE
>       A(I) = C
>    ENDIF
> ENDDO
> END
>
> then it is vectorized just fine.  Similarly even if this optimization
> isn't performed, with masked loads it should be optimizable.
> See http://gcc.gnu.org/ml/gcc-patches/2012-11/msg00202.html
> though we probably just want a better infrastructure for that.
>
>         Jakub
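
For readers who don't read Fortran, Jakub's rewrite can be sketched in C
roughly as follows. The function names xyz_branchy and xyz_hoisted are
made up for illustration, and float is assumed for the default REAL
type. The sketch only shows the source-level transformation (loading
B(I) once before the conditional); whether a particular compiler version
vectorizes either C form is not something it establishes.

```c
/* Direct C analogue of the original subroutine: b[i] is read
   separately inside each arm of the conditional. */
void xyz_branchy(float *a, const float *b, int n)
{
    for (int i = 0; i < n; i++) {
        if (a[i] > 0.0f)
            a[i] = b[i] / a[i];
        else
            a[i] = b[i];
    }
}

/* Analogue of the rewritten subroutine: the load from b[i] is hoisted
   above the conditional, so both arms only use the local value c. */
void xyz_hoisted(float *a, const float *b, int n)
{
    for (int i = 0; i < n; i++) {
        float c = b[i];
        if (a[i] > 0.0f)
            a[i] = c / a[i];
        else
            a[i] = c;
    }
}
```

The two functions compute the same values; the only difference is that
the second performs the load unconditionally, which is valid here
because both branches read the same element.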
