On Sun, 20 Nov 2016, Prathamesh Kulkarni wrote: > Hi, > As suggested by Martin in PR78153 strlen's return value cannot exceed > PTRDIFF_MAX. > So I set it's range to [0, PTRDIFF_MAX - 1] in extract_range_basic() > in the attached patch. > > However it regressed strlenopt-3.c: > > Consider fn1() from strlenopt-3.c: > > __attribute__((noinline, noclone)) size_t > fn1 (char *p, char *q) > { > size_t s = strlen (q); > strcpy (p, q); > return s - strlen (p); > } > > The optimized dump shows the following: > > __attribute__((noclone, noinline)) > fn1 (char * p, char * q) > { > size_t s; > size_t _7; > long unsigned int _9; > > <bb 2>: > s_4 = strlen (q_3(D)); > _9 = s_4 + 1; > __builtin_memcpy (p_5(D), q_3(D), _9); > _7 = 0; > return _7; > > } > > which introduces the regression, because the test expects "return 0;" in > fn1(). > > The issue seems to be in vrp2: > > Before the patch: > Visiting statement: > s_4 = strlen (q_3(D)); > Found new range for s_4: VARYING > > Visiting statement: > _1 = s_4; > Found new range for _1: [s_4, s_4] > marking stmt to be not simulated again > > Visiting statement: > _7 = s_4 - _1; > Applying pattern match.pd:111, gimple-match.c:27997 > Match-and-simplified s_4 - _1 to 0 > Intersecting > [0, 0] > and > [0, +INF] > to > [0, 0] > Found new range for _7: [0, 0] > > __attribute__((noclone, noinline)) > fn1 (char * p, char * q) > { > size_t s; > long unsigned int _1; > long unsigned int _9; > > <bb 2>: > s_4 = strlen (q_3(D)); > _9 = s_4 + 1; > __builtin_memcpy (p_5(D), q_3(D), _9); > _1 = s_4; > return 0; > > } > > > After the patch: > Visiting statement: > s_4 = strlen (q_3(D)); > Intersecting > [0, 9223372036854775806] > and > [0, 9223372036854775806] > to > [0, 9223372036854775806] > Found new range for s_4: [0, 9223372036854775806] > marking stmt to be not simulated again > > Visiting statement: > _1 = s_4; > Intersecting > [0, 9223372036854775806] EQUIVALENCES: { s_4 } (1 elements) > and > [0, 9223372036854775806] > to > [0, 9223372036854775806] EQUIVALENCES: { s_4 } (1 elements) > Found new range for _1: [0, 9223372036854775806] > marking stmt to be not simulated again > > Visiting statement: > _7 = s_4 - _1; > Intersecting > ~[9223372036854775807, 9223372036854775809] > and > ~[9223372036854775807, 9223372036854775809] > to > ~[9223372036854775807, 9223372036854775809] > Found new range for _7: ~[9223372036854775807, 9223372036854775809] > marking stmt to be not simulated again > > __attribute__((noclone, noinline)) > fn1 (char * p, char * q) > { > size_t s; > long unsigned int _1; > size_t _7; > long unsigned int _9; > > <bb 2>: > s_4 = strlen (q_3(D)); > _9 = s_4 + 1; > __builtin_memcpy (p_5(D), q_3(D), _9); > _1 = s_4; > _7 = s_4 - _1; > return _7; > > } > > Then forwprop4 turns > _1 = s_4 > _7 = s_4 - _1 > into > _7 = 0 > > and we end up with: > _7 = 0 > return _7 > in optimized dump. > > Running ccp again after forwprop4 trivially solves the issue, however > I am not sure if we want to run ccp again ? > > The issue is probably with extract_range_from_ssa_name(): > For _1 = s_4 > > Before patch: > VR for s_4 is set to varying. > So VR for _1 is set to [s_4, s_4] by extract_range_from_ssa_name. > Since VR for _1 is [s_4, s_4] it implicitly implies that _1 is equal to s_4, > and vrp is able to transform _7 = s_4 - _1 to _7 = 0 (by using > match.pd pattern x - x -> 0). > > After patch: > VR for s_4 is set to [0, PTRDIFF_MAX - 1] > And correspondingly VR for _1 is set to [0, PTRDIFF_MAX - 1] > so IIUC, we then lose the information that _1 is equal to s_4,
We don't lose it, it's in its set of equivalencies. > and vrp doesn't transform _7 = s_4 - _1 to _7 = 0. > forwprop4 does that because it sees that s_4 and _1 are equivalent. > Does this sound correct ? Yes. So the issue is really that vrp_visit_assignment_or_call calls gimple_fold_stmt_to_constant_1 with vrp_valueize[_1] which when we do not have a singleton VR_RANGE does not fall back to looking at equivalences (there's not a good cheap way to do that currently because VRP doesn't keep a proper copy lattice but simply IORs equivalences from all equivalences). In theory simply using the first set bit might work. Thus sth like @@ -7057,6 +7030,12 @@ vrp_valueize (tree name) || is_gimple_min_invariant (vr->min)) && vrp_operand_equal_p (vr->min, vr->max)) return vr->min; + else if (vr->equiv && ! bitmap_empty_p (vr->equiv)) + { + unsigned num = bitmap_first_set_bit (vr->equiv); + if (num < SSA_NAME_VERSION (name)) + return ssa_name (num); + } } return name; } might work with the idea of simply doing canonicalization to one of the equivalences. But as we don't allow copies in the SSA def stmt (via vrp_valueize_1) I'm not sure that's good enough canonicalization. Richard.