sjoerdmeijer wrote:

Yeah, we need numbers. 
I took the motivating example from the description, slightly modified it, and 
added a vectorisable loop:

    int test_simple_array(int i, int n, int * __restrict A, int * __restrict B) {
      int arr[10];
      for (int i = 0; i < n; ++i)
        arr[i] += A[i] * B[i];
      return arr[i];
    }

This gets vectorised, see: https://godbolt.org/z/GWb5h7hMb

With this patch locally applied, it is no longer vectorised, and I am getting 
the following with -Rpass-analysis=loop-vectorize:

    t.c:13:11: remark: loop not vectorized: unsafe dependent memory operations in loop. Use #pragma clang loop distribute(enable) to allow loop distribution to attempt to isolate the offending operations into a separate loop
    Unsafe indirect dependence. Memory location is the same as accessed at t.c:13:4 [-Rpass-analysis=loop-vectorize]

When I slightly modify the input example so that it no longer accumulates, i.e.
just have this:

    arr[i] = A[i] * B[i];

I am getting:

    t.c:13:11: remark: Recipe with invalid costs prevented vectorization at VF=(vscale x 1): store [-Rpass-analysis=loop-
    t.c:12:3: remark: the cost-model indicates that vectorization is not beneficial [-Rpass-analysis=loop-vectorize]

This version is also vectorised by unpatched clang.
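
For reference, here is a minimal sketch of that non-accumulating variant written
out in full. It assumes the function is otherwise unchanged from the first
example. The name `test_simple_store` is made up here, and callers are assumed
to pass 0 <= i < n <= 10 so that the returned element has been written:

```c
/* Sketch only: the first example with the accumulation replaced by a
 * plain store. The name test_simple_store is hypothetical. */
int test_simple_store(int i, int n, int *__restrict A, int *__restrict B) {
  int arr[10];
  /* As in the first example, the loop variable shadows the parameter i;
   * the return statement below uses the parameter. */
  for (int i = 0; i < n; ++i)
    arr[i] = A[i] * B[i];
  /* Assumed: 0 <= i < n <= 10, so arr[i] was stored to by the loop. */
  return arr[i];
}
```

Because every element that is read back is first written, this variant does not
depend on the initial contents of `arr`.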

I hope I didn't make a silly mistake with this quick little exercise, but it
looks like this patch gets in the way of vectorisation, which doesn't bode
very well for perf numbers.

https://github.com/llvm/llvm-project/pull/159046
