https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66678
Bug ID: 66678 Summary: loop counter not accurately described by vrp Product: gcc Version: 6.0 Status: UNCONFIRMED Severity: minor Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: vries at gcc dot gnu.org Target Milestone: --- consider testcase: ... void f (unsigned int n, unsigned int *__restrict__ a, unsigned int *__restrict__ b, unsigned int *__restrict__ c) { unsigned int i; for (i = 0; i < n; ++i) c[i] = a[i] + b[i]; } ... After vrp1 we have: ... f (unsigned intD.9 nD.2874, unsigned intD.9 * restrict aD.2875, unsigned intD.9 * restrict bD.2876, unsigned intD.9 * restrict cD.2877) { unsigned intD.9 iD.2880; long unsigned intD.10 _5; long unsigned intD.10 _6; unsigned intD.9 * _8; unsigned intD.9 * _10; unsigned intD.9 _11; unsigned intD.9 * _13; unsigned intD.9 _14; unsigned intD.9 _15; ;; basic block 2, loop depth 0, count 0, freq 900, maybe hot ;; prev block 0, next block 3, flags: (NEW, REACHABLE) ;; pred: ENTRY [100.0%] (FALLTHRU,EXECUTABLE) goto <bb 4>; ;; succ: 4 [100.0%] (FALLTHRU,EXECUTABLE) ;; basic block 3, loop depth 1, count 0, freq 9100, maybe hot ;; prev block 2, next block 4, flags: (NEW, REACHABLE) ;; pred: 4 [91.0%] (TRUE_VALUE,EXECUTABLE) # RANGE [0, 18446744073709551615] NONZERO 4294967295 _5 = (long unsigned intD.10) i_1; # RANGE [0, 18446744073709551615] NONZERO 18446744073709551612 _6 = _5 * 4; # PT = { D.2900 } (nonlocal) _8 = c_7(D) + _6; # PT = { D.2898 } (nonlocal) _10 = a_9(D) + _6; # VUSE <.MEM_2> _11 = MEM[(unsigned intD.9 *)_10 clique 1 base 2]; # PT = { D.2899 } (nonlocal) _13 = b_12(D) + _6; # VUSE <.MEM_2> _14 = MEM[(unsigned intD.9 *)_13 clique 1 base 3]; # RANGE [0, 4294967295] _15 = _11 + _14; # .MEM_16 = VDEF <.MEM_2> MEM[(unsigned intD.9 *)_8 clique 1 base 1] = _15; # RANGE [0, 4294967295] i_17 = i_1 + 1; ;; succ: 4 [100.0%] (FALLTHRU,DFS_BACK,EXECUTABLE) ;; basic block 4, loop depth 1, count 0, freq 10000, maybe hot ;; prev block 3, next block 5, flags: (NEW, REACHABLE) ;; pred: 2 [100.0%] (FALLTHRU,EXECUTABLE) ;; 3 [100.0%] (FALLTHRU,DFS_BACK,EXECUTABLE) # i_1 = PHI <0(2), i_17(3)> # .MEM_2 = PHI <.MEM_3(D)(2), .MEM_16(3)> if (i_1 < n_4(D)) goto <bb 3>; else goto <bb 5>; ;; succ: 3 [91.0%] (TRUE_VALUE,EXECUTABLE) ;; 5 [9.0%] (FALSE_VALUE,EXECUTABLE) ;; basic block 5, loop depth 0, count 0, freq 900, maybe hot ;; prev block 4, next block 1, flags: (NEW, REACHABLE) ;; pred: 4 [9.0%] (FALSE_VALUE,EXECUTABLE) # VUSE <.MEM_2> return; ;; succ: EXIT [100.0%] } ... AFAIU: - the loop iv i_1 has range [0, 4294967294], and - the loop iv increment i_17 has range [1, 4294967295] But in the dump resulting from vrp1, i_1 has no range assigned, and i_17 has RANGE [0, 4294967295] (which is equivalent to no range assigned). During vrp the pass inserts an assert at the start of bb 3: ... ;; basic block 3, loop depth 1, count 0, freq 9100, maybe hot ;; prev block 2, next block 4, flags: (NEW, REACHABLE) ;; pred: 4 [91.0%] (TRUE_VALUE,EXECUTABLE) i_18 = ASSERT_EXPR <i_1, i_1 < n_4(D)>; # RANGE [0, 18446744073709551615] NONZERO 4294967295 _5 = (long unsigned intD.10) i_18; # RANGE [0, 18446744073709551615] NONZERO 18446744073709551612 _6 = _5 * 4; # PT = { D.2900 } (nonlocal) _8 = c_7(D) + _6; # PT = { D.2898 } (nonlocal) _10 = a_9(D) + _6; # VUSE <.MEM_2> _11 = MEM[(unsigned intD.9 *)_10 clique 1 base 2]; # PT = { D.2899 } (nonlocal) _13 = b_12(D) + _6; # VUSE <.MEM_2> _14 = MEM[(unsigned intD.9 *)_13 clique 1 base 3]; _15 = _11 + _14; # .MEM_16 = VDEF <.MEM_2> MEM[(unsigned intD.9 *)_8 clique 1 base 1] = _15; i_17 = i_18 + 1; ;; succ: 4 [100.0%] (FALLTHRU,DFS_BACK,EXECUTABLE) ;; basic block 4, loop depth 1, count 0, freq 10000, maybe hot ;; prev block 3, next block 5, flags: (NEW, REACHABLE) ;; pred: 2 [100.0%] (FALLTHRU,EXECUTABLE) ;; 3 [100.0%] (FALLTHRU,DFS_BACK,EXECUTABLE) # i_1 = PHI <0(2), i_17(3)> # .MEM_2 = PHI <.MEM_3(D)(2), .MEM_16(3)> if (i_1 < n_4(D)) goto <bb 3>; else goto <bb 5>; ... When visiting the assert we get: ... Visiting statement: i_18 = ASSERT_EXPR <i_1, i_1 < n_4(D)>; Intersecting [0, n_4(D) + 4294967295] EQUIVALENCES: { i_1 } (1 elements) and [0, 0] to [0, n_4(D) + 4294967295] EQUIVALENCES: { i_1 } (1 elements) Found new range for i_18: [0, n_4(D) + 4294967295] ... AFAIU, if we have no information on n_4, then range [0, n_4(D) + 4294967295] is equal to [0, 4294967295] >From the assert however we can also derive a range of [0, 4294967294 ] given that i_1 < n_4 and n_4 is at most 4294967295. So, a more accurate range is [0, MIN (n_4(D) + 4294967295, 4294967294) ].