https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67167
            Bug ID: 67167
           Summary: cilkplus vectorization problems
           Product: gcc
           Version: 5.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: marcin.krotkiewski at gmail dot com
  Target Milestone: ---

I think there is a problem with vectorization of arithmetic operations in the
cilkplus implementation in gcc. I have inspected the generated asm of the
following two implementations of vector addition (a = a + b). The code is
compiled with

    gcc -O3 -mavx -ftree-vectorize -fopt-info-vec -fcilkplus test.c

    // ICC compatibility - alignment hint
    #ifdef __GNUC__
    #define __assume_aligned(lvalueptr, align) lvalueptr = __builtin_assume_aligned (lvalueptr, align)
    #endif

    #define RESTRICT __restrict__
    typedef double Double;

    void test(Double * RESTRICT a, Double * RESTRICT b, int size)
    {
      int i;
      __assume_aligned(a, 64);
      __assume_aligned(b, 64);

      for(i=0; i<size; i++)
        a[i] = a[i] + b[i];
    }

    void test_cilkplus1(Double * RESTRICT a, Double * RESTRICT b, int size)
    {
      __assume_aligned(a, 64);
      __assume_aligned(b, 64);

      a[0:size] = a[0:size] + b[0:size];
    }

The first function (test) is vectorized as expected. Here is the generated asm:

    .L4:
            vmovapd (%rdi,%r8), %ymm0
            addl    $1, %r9d
            vaddpd  (%rsi,%r8), %ymm0, %ymm0
            vmovapd %ymm0, (%rdi,%r8)
            addq    $32, %r8
            cmpl    %r9d, %ecx
            ja      .L4

In contrast, the second function (test_cilkplus1) is not vectorized; the loop
uses scalar vmovsd/vaddsd and advances by 8 bytes per iteration:

    .L21:
            vmovsd  (%rdi,%rax), %xmm0
            movl    %ecx, %r8d
            addl    $1, %ecx
            vaddsd  (%rsi,%rax), %xmm0, %xmm0
            vmovsd  %xmm0, (%rdi,%rax)
            addq    $8, %rax
            cmpl    %r8d, %edx
            jg      .L21

I have made sure that the compiler understands that there is no aliasing
(restrict) and that the vectors are aligned in memory. Clearly this is enough
for the standard implementation to produce vectorized code, but not for the
CilkPlus array notation.
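Because both functions promise 64-byte alignment to the compiler, any caller used to exercise them has to pass buffers that really are aligned that way. Below is a minimal sketch of such a driver, assuming C11 aligned_alloc is available. The helper name check_vector_add and the fill values are hypothetical and not part of the report; only the plain-loop variant is included, with the alignment hint written out directly, so the block builds without -fcilkplus.

```c
#include <stdlib.h>

#define RESTRICT __restrict__
typedef double Double;

/* Plain-loop variant from the report, alignment hint written out inline. */
void test(Double * RESTRICT a, Double * RESTRICT b, int size)
{
    int i;
    a = __builtin_assume_aligned(a, 64);
    b = __builtin_assume_aligned(b, 64);

    for (i = 0; i < size; i++)
        a[i] = a[i] + b[i];
}

/* Hypothetical helper (not from the report): fills two 64-byte aligned
 * buffers, runs test(), and returns the number of wrong elements,
 * or -1 if allocation fails. */
int check_vector_add(int size)
{
    /* aligned_alloc wants the byte count to be a multiple of the
     * alignment, so round up to the next multiple of 64. */
    size_t bytes = ((size_t)size * sizeof(Double) + 63) / 64 * 64;
    Double *a, *b;
    int i, bad = 0;

    if (bytes == 0)
        bytes = 64;
    a = aligned_alloc(64, bytes);
    b = aligned_alloc(64, bytes);
    if (!a || !b) {
        free(a);
        free(b);
        return -1;
    }

    for (i = 0; i < size; i++) {
        a[i] = i;
        b[i] = 2.0 * i;
    }

    test(a, b, size);

    /* i + 2*i is exactly representable as a double for these sizes. */
    for (i = 0; i < size; i++)
        if (a[i] != 3.0 * i)
            bad++;

    free(a);
    free(b);
    return bad;
}
```

Calling check_vector_add(1000) from a small main should return 0. The array-notation variant could be checked the same way by swapping the call to test_cilkplus1 and building with -fcilkplus on a GCC release that supports it.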