sjoerdmeijer wrote:
Yeah, we need numbers.
I took the motivating example from the description, slightly modified it, and
added a vectorisable loop:
int test_simple_array(int i, int n, int * __restrict A, int * __restrict B)
{
  int arr[10];
  for (int i = 0; i < n; ++i)
    arr[i] += A[i] * B[i];
  return arr[i];
}
This gets vectorised, see: https://godbolt.org/z/GWb5h7hMb
With this patch locally applied, it is no longer vectorised, and I am getting
the following with -Rpass-analysis=loop-vectorize:
t.c:13:11: remark: loop not vectorized: unsafe dependent memory operations
in loop. Use #pragma clang loop distribute(enable) to allow loop
distribution to attempt to isolate the offending operations into a
separate loop
Unsafe indirect dependence. Memory location is the same as accessed at
t.c:13:4 [-Rpass-analysis=loop-vectorize]
When I modify the example slightly so that it no longer accumulates, i.e. the
loop body is just this:
arr[i] = A[i] * B[i];
I am getting:
t.c:13:11: remark: Recipe with invalid costs prevented vectorization at
VF=(vscale x 1): store [-Rpass-analysis=loop-
t.c:12:3: remark: the cost-model indicates that vectorization is not
beneficial [-Rpass-analysis=loop-vectorize]
This version is also vectorised by unpatched clang.
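For anyone wanting to repeat the comparison, the two variants can be put side
by side in one file. This is only a sketch under stated assumptions: the
function names test_accumulate and test_store_only, the ARR_LEN macro, the idx
parameter (renamed from the shadowed i), the zero-initialisation of arr, and
the bounds assert are all additions that are not in the snippet above, and
they may change what the vectoriser does compared with the original test case.

```c
#include <assert.h>

/* Hypothetical name for the local array size used in the example (10). */
#define ARR_LEN 10

/* Accumulating variant: arr[i] += A[i] * B[i].
   Assumption: arr is zero-initialised so the result is well defined;
   the original snippet leaves it uninitialised. */
int test_accumulate(int idx, int n, int * __restrict A, int * __restrict B)
{
  /* Assumption: callers keep n and idx within the local array. */
  assert(n >= 0 && n <= ARR_LEN && idx >= 0 && idx < ARR_LEN);
  int arr[ARR_LEN] = {0};
  for (int i = 0; i < n; ++i)
    arr[i] += A[i] * B[i];
  return arr[idx];
}

/* Store-only variant: arr[i] = A[i] * B[i], no accumulation. */
int test_store_only(int idx, int n, int * __restrict A, int * __restrict B)
{
  assert(n >= 0 && n <= ARR_LEN && idx >= 0 && idx < ARR_LEN);
  int arr[ARR_LEN] = {0};
  for (int i = 0; i < n; ++i)
    arr[i] = A[i] * B[i];
  return arr[idx];
}
```

Compiling such a file with -Rpass-analysis=loop-vectorize, once with patched
and once with unpatched clang, should show whether each loop is still
vectorised. The optimisation level and target flags behind the Godbolt link
are not reproduced here, so they would need to be matched by hand.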
I hope I didn't make a silly mistake in this quick little exercise, but it
looks like this gets in the way of vectorisation, which doesn't bode very
well for perf numbers.
https://github.com/llvm/llvm-project/pull/159046