https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98813
Bug ID: 98813
Summary: loop is sub-optimized if index is unsigned int with offset
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: guojiufu at gcc dot gnu.org
Target Milestone: ---

For the below code:
---t.c----
void
foo (const double* __restrict__ A, const double* __restrict__ B,
     double* __restrict__ C, int n, int k, int m)
{
  for (unsigned int l_m = 0; l_m < m; l_m++)
    C[n + l_m] += A[k + l_m] * B[k];
}
------
Compiling with `gcc -O3 -S t.c -fopt-info` shows that the loop is not vectorized, because it may not be safe to optimize it directly: the unsigned int index `n + l_m` (and likewise `k + l_m`) can overflow, i.e. wrap around.

clang can vectorize this code; it emits run-time instructions that check whether the optimization is safe:
```
  %1 = add nsw i64 %wide.trip.count, -1   ; cnt-1
  %2 = trunc i64 %1 to i32                ; (int)(cnt-1)
  %3 = xor i32 %n, -1                     ; n xor -1
  %4 = icmp ult i32 %3, %2                ; (n xor -1) < (int)(cnt-1)
  %5 = icmp ugt i64 %1, 4294967295        ; cnt-1 > 4294967295 (overflow?)
  %6 = or i1 %4, %5
```
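
For illustration only, here is a source-level sketch of what such a run-time check might look like if the loop were versioned by hand. The names `index_may_wrap` and `foo_versioned` are hypothetical and are not taken from GCC or clang. The sketch assumes a 32-bit `unsigned int`, and it also assumes an analogous check is needed for `k`, although the IR quoted above only shows the one for `n`.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if base + i wraps around in unsigned int
   arithmetic for some i in [0, cnt).  Same shape as the quoted check:
   (~base) < (unsigned)(cnt - 1)  ||  (cnt - 1) > UINT_MAX.  */
static bool index_may_wrap(unsigned int base, unsigned long long cnt)
{
  if (cnt == 0)
    return false;
  unsigned long long last = cnt - 1;
  return ~base < (unsigned int) last || last > UINT_MAX;
}

/* Hand-versioned variant of foo from the report (a sketch, not what
   either compiler actually generates).  */
void foo_versioned(const double *restrict A, const double *restrict B,
                   double *restrict C, int n, int k, int m)
{
  /* l_m < m compares as unsigned, so the trip count is (unsigned) m.  */
  unsigned int cnt = (unsigned int) m;

  if (!index_may_wrap((unsigned int) n, cnt)
      && !index_may_wrap((unsigned int) k, cnt))
    {
      /* Fast path: the indices never wrap, so both accesses are
         contiguous and the loop is a plain candidate for vectorization.  */
      const double *a = A + (unsigned int) k;
      double *c = C + (unsigned int) n;
      for (size_t i = 0; i < cnt; i++)
        c[i] += a[i] * B[k];
    }
  else
    {
      /* Fallback: keep the original wrapping index arithmetic.  */
      for (unsigned int l_m = 0; l_m < cnt; l_m++)
        C[n + l_m] += A[k + l_m] * B[k];
    }
}
```

With `n = UINT_MAX - 3` the helper reports no wrap for 4 iterations and a wrap for 5, which is the boundary the `(n xor -1) < cnt-1` comparison detects.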