On 21 November 2016 at 15:10, Richard Biener <rguent...@suse.de> wrote:
> On Sun, 20 Nov 2016, Prathamesh Kulkarni wrote:
>
>> Hi,
>> As suggested by Martin in PR78153 strlen's return value cannot exceed
>> PTRDIFF_MAX.
>> So I set it's range to [0, PTRDIFF_MAX - 1] in extract_range_basic()
>> in the attached patch.
>>
>> However it regressed strlenopt-3.c:
>>
>> Consider fn1() from strlenopt-3.c:
>>
>> __attribute__((noinline, noclone)) size_t
>> fn1 (char *p, char *q)
>> {
>>   size_t s = strlen (q);
>>   strcpy (p, q);
>>   return s - strlen (p);
>> }
>>
>> The optimized dump shows the following:
>>
>> __attribute__((noclone, noinline))
>> fn1 (char * p, char * q)
>> {
>>   size_t s;
>>   size_t _7;
>>   long unsigned int _9;
>>
>>   <bb 2>:
>>   s_4 = strlen (q_3(D));
>>   _9 = s_4 + 1;
>>   __builtin_memcpy (p_5(D), q_3(D), _9);
>>   _7 = 0;
>>   return _7;
>>
>> }
>>
>> which introduces the regression, because the test expects "return 0;" in 
>> fn1().
>>
>> The issue seems to be in vrp2:
>>
>> Before the patch:
>> Visiting statement:
>> s_4 = strlen (q_3(D));
>> Found new range for s_4: VARYING
>>
>> Visiting statement:
>> _1 = s_4;
>> Found new range for _1: [s_4, s_4]
>> marking stmt to be not simulated again
>>
>> Visiting statement:
>> _7 = s_4 - _1;
>> Applying pattern match.pd:111, gimple-match.c:27997
>> Match-and-simplified s_4 - _1 to 0
>> Intersecting
>>   [0, 0]
>> and
>>   [0, +INF]
>> to
>>   [0, 0]
>> Found new range for _7: [0, 0]
>>
>> __attribute__((noclone, noinline))
>> fn1 (char * p, char * q)
>> {
>>   size_t s;
>>   long unsigned int _1;
>>   long unsigned int _9;
>>
>>   <bb 2>:
>>   s_4 = strlen (q_3(D));
>>   _9 = s_4 + 1;
>>   __builtin_memcpy (p_5(D), q_3(D), _9);
>>   _1 = s_4;
>>   return 0;
>>
>> }
>>
>>
>> After the patch:
>> Visiting statement:
>> s_4 = strlen (q_3(D));
>> Intersecting
>>   [0, 9223372036854775806]
>> and
>>   [0, 9223372036854775806]
>> to
>>   [0, 9223372036854775806]
>> Found new range for s_4: [0, 9223372036854775806]
>> marking stmt to be not simulated again
>>
>> Visiting statement:
>> _1 = s_4;
>> Intersecting
>>   [0, 9223372036854775806]  EQUIVALENCES: { s_4 } (1 elements)
>> and
>>   [0, 9223372036854775806]
>> to
>>   [0, 9223372036854775806]  EQUIVALENCES: { s_4 } (1 elements)
>> Found new range for _1: [0, 9223372036854775806]
>> marking stmt to be not simulated again
>>
>> Visiting statement:
>> _7 = s_4 - _1;
>> Intersecting
>>   ~[9223372036854775807, 9223372036854775809]
>> and
>>   ~[9223372036854775807, 9223372036854775809]
>> to
>>   ~[9223372036854775807, 9223372036854775809]
>> Found new range for _7: ~[9223372036854775807, 9223372036854775809]
>> marking stmt to be not simulated again
>>
>> __attribute__((noclone, noinline))
>> fn1 (char * p, char * q)
>> {
>>   size_t s;
>>   long unsigned int _1;
>>   size_t _7;
>>   long unsigned int _9;
>>
>>   <bb 2>:
>>   s_4 = strlen (q_3(D));
>>   _9 = s_4 + 1;
>>   __builtin_memcpy (p_5(D), q_3(D), _9);
>>   _1 = s_4;
>>   _7 = s_4 - _1;
>>   return _7;
>>
>> }
>>
>> Then forwprop4 turns
>> _1 = s_4
>> _7 = s_4 - _1
>> into
>> _7 = 0
>>
>> and we end up with:
>> _7 = 0
>> return _7
>> in optimized dump.
>>
>> Running ccp again after forwprop4 trivially solves the issue, however
>> I am not sure if we want to run ccp again ?
>>
>> The issue is probably with extract_range_from_ssa_name():
>> For _1 = s_4
>>
>> Before patch:
>> VR for s_4 is set to varying.
>> So VR for _1 is set to [s_4, s_4] by extract_range_from_ssa_name.
>> Since VR for _1 is [s_4, s_4] it implicitly implies that _1 is equal to s_4,
>> and vrp is able to transform _7 = s_4 - _1 to _7 = 0 (by using
>> match.pd pattern x - x -> 0).
>>
>> After patch:
>> VR for s_4 is set to [0, PTRDIFF_MAX - 1]
>> And correspondingly VR for _1 is set to [0, PTRDIFF_MAX - 1]
>> so IIUC, we then lose the information that _1 is equal to s_4,
>
> We don't lose it, it's in its set of equivalencies.
Ah, I missed that, thanks. For some reason I had mis-conception that
equivalences stores
variables which have same value-ranges but are not necessarily equal.
>
>> and vrp doesn't transform _7 = s_4 - _1 to _7 = 0.
>> forwprop4 does that because it sees that s_4 and _1 are equivalent.
>> Does this sound correct ?
>
> Yes.  So the issue is really that vrp_visit_assignment_or_call calls
> gimple_fold_stmt_to_constant_1 with vrp_valueize[_1] which when
> we do not have a singleton VR_RANGE does not fall back to looking
> at equivalences (there's not a good cheap way to do that currently because
> VRP doesn't keep a proper copy lattice but simply IORs equivalences
> from all equivalences).  In theory simply using the first set bit
> might work.  Thus sth like
>
> @@ -7057,6 +7030,12 @@ vrp_valueize (tree name)
>               || is_gimple_min_invariant (vr->min))
>           && vrp_operand_equal_p (vr->min, vr->max))
>         return vr->min;
> +      else if (vr->equiv && ! bitmap_empty_p (vr->equiv))
> +       {
> +         unsigned num = bitmap_first_set_bit (vr->equiv);
> +         if (num < SSA_NAME_VERSION (name))
> +           return ssa_name (num);
> +       }
>      }
>    return name;
>  }
>
> might work with the idea of simply doing canonicalization to one of
> the equivalences.  But as we don't allow copies in the SSA def stmt
> (via vrp_valueize_1) I'm not sure that's good enough canonicalization.
IIUC, we record the equivalent variables in vr->equiv
but do not canonicalize to one of the equivalence like "copy-of value"
in copyprop ?
Using first set bit unfortunately doesn't help for the above case.

Sorry if this sounds silly, should we just run copyprop/ccp once again
after vrp2 to ensure that there are no copies left ?
However that might be quite expensive ?
Or make vrp track copies like copyprop using a separate copy-of lattice ?

Thanks,
Prathamesh
>
> Richard.

Reply via email to