https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117043
            Bug ID: 117043
           Summary: missed vectorization opportunity: data[i] = data[i - a[i]] + 1; (a[i]=0)
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: 652023330028 at smail dot nju.edu.cn
  Target Milestone: ---

Hello, we noticed that there seems to be a missing vectorization for the code
below.

Reduced code:
https://godbolt.org/z/zxKhKWxGq

int data[100];
void f(int a[100])
{
    for(int i = 0; i < 100; i++){
        a[i] = 0;
    }
    for(int i = 0; i < 100; i++){
        data[i] = data[i - a[i]] + 1;
    }
}

GCC -O3 -fno-vect-cost-model:
f(int*):
        mov     QWORD PTR [rdi], 0
        mov     rsi, rdi
        lea     rdi, [rdi+8]
        xor     eax, eax
        mov     QWORD PTR [rdi+384], 0
        mov     rcx, rsi
        and     rdi, -8
        sub     rcx, rdi
        add     ecx, 400
        shr     ecx, 3
        rep stosq
.L2:
        mov     edx, eax
        sub     edx, DWORD PTR [rsi+rax*4]
        movsx   rdx, edx
        mov     edx, DWORD PTR data[0+rdx*4]
        add     edx, 1
        mov     DWORD PTR data[0+rax*4], edx
        add     rax, 1
        cmp     rax, 100
        jne     .L2
        ret

  <bb 3> [local count: 1063004408]:
  # ivtmp.6_33 = PHI <ivtmp.6_37(3), 0(2)>
  # DEBUG i => NULL
  i_26 = (int) ivtmp.6_33;
  # DEBUG i => i_26
  # DEBUG BEGIN_STMT
  _7 = MEM[(int *)a_16(D) + ivtmp.6_33 * 4];
  _8 = i_26 - _7;
  _9 = data[_8];
  _10 = _9 + 1;
  MEM[(int *)&data + ivtmp.6_33 * 4] = _10;

Expected code:
vectorize the loop body. For example:

  <bb 3> [local count: 265751102]:
  # ivtmp.14_19 = PHI <ivtmp.14_21(3), ivtmp.14_4(2)>
  # DEBUG BEGIN_STMT
  _5 = (void *) ivtmp.14_19;
  vect__4.7_22 = MEM <vector(4) int> [(int *)_5];
  vect__5.8_20 = vect__4.7_22 + { 1, 1, 1, 1 };
  MEM <vector(4) int> [(int *)_5] = vect__5.8_20;
  # DEBUG BEGIN_STMT
  # DEBUG BEGIN_STMT
  ivtmp.14_21 = ivtmp.14_19 + 16;
  if (_11 != ivtmp.14_21)
    goto <bb 3>; [95.96%]
  else
    goto <bb 4>; [4.04%]

Thank you very much for your time and effort! We look forward to hearing from
you.
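For reference, a minimal sketch of the equivalence the report relies on. Because the first loop stores 0 into every a[i], the index i - a[i] in the second loop is always i, so the second loop should behave like data[i] = data[i] + 1. The names f_simplified and check_equivalence are hypothetical helpers added for illustration and are not part of the reduced test case; the sketch also assumes that a does not overlap data.

```c
#include <string.h>

int data[100];

/* Loop nest from the report, unchanged. */
void f(int a[100])
{
    for (int i = 0; i < 100; i++) {
        a[i] = 0;
    }
    for (int i = 0; i < 100; i++) {
        data[i] = data[i - a[i]] + 1;
    }
}

/* Hypothetical hand-simplified form: a[i] is known to be 0 after the
   first loop, so data[i - a[i]] is just data[i]. */
void f_simplified(int a[100])
{
    for (int i = 0; i < 100; i++) {
        a[i] = 0;
    }
    for (int i = 0; i < 100; i++) {
        data[i] = data[i] + 1;
    }
}

/* Hypothetical helper: runs both versions from the same initial contents
   of data and returns 1 if the final contents match, 0 otherwise.
   Assumption: a is a separate local array that does not overlap data. */
int check_equivalence(void)
{
    int a[100], initial[100], result_original[100];

    for (int i = 0; i < 100; i++) {
        initial[i] = i * 7 - 3; /* arbitrary starting values */
        a[i] = i + 1;           /* arbitrary values, overwritten by f */
    }

    memcpy(data, initial, sizeof data);
    f(a);
    memcpy(result_original, data, sizeof data);

    memcpy(data, initial, sizeof data);
    f_simplified(a);

    return memcmp(result_original, data, sizeof data) == 0;
}
```

Calling check_equivalence() from a small main and checking that it returns 1 confirms that the two forms agree for this input. It does not show how the vectorizer should derive the simplification, only that the simplified loop is the dependence-free form shown in the expected GIMPLE.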