On 22 November 2016 at 20:18, Richard Biener <rguent...@suse.de> wrote:
> On Tue, 22 Nov 2016, Prathamesh Kulkarni wrote:
>
>> On 21 November 2016 at 15:10, Richard Biener <rguent...@suse.de> wrote:
>> > On Sun, 20 Nov 2016, Prathamesh Kulkarni wrote:
>> >
>> >> Hi,
>> >> As suggested by Martin in PR78153 strlen's return value cannot exceed
>> >> PTRDIFF_MAX.
>> >> So I set it's range to [0, PTRDIFF_MAX - 1] in extract_range_basic()
>> >> in the attached patch.
>> >>
>> >> However it regressed strlenopt-3.c:
>> >>
>> >> Consider fn1() from strlenopt-3.c:
>> >>
>> >> __attribute__((noinline, noclone)) size_t
>> >> fn1 (char *p, char *q)
>> >> {
>> >>   size_t s = strlen (q);
>> >>   strcpy (p, q);
>> >>   return s - strlen (p);
>> >> }
>> >>
>> >> The optimized dump shows the following:
>> >>
>> >> __attribute__((noclone, noinline))
>> >> fn1 (char * p, char * q)
>> >> {
>> >>   size_t s;
>> >>   size_t _7;
>> >>   long unsigned int _9;
>> >>
>> >>   <bb 2>:
>> >>   s_4 = strlen (q_3(D));
>> >>   _9 = s_4 + 1;
>> >>   __builtin_memcpy (p_5(D), q_3(D), _9);
>> >>   _7 = 0;
>> >>   return _7;
>> >>
>> >> }
>> >>
>> >> which introduces the regression, because the test expects "return 0;" in 
>> >> fn1().
>> >>
>> >> The issue seems to be in vrp2:
>> >>
>> >> Before the patch:
>> >> Visiting statement:
>> >> s_4 = strlen (q_3(D));
>> >> Found new range for s_4: VARYING
>> >>
>> >> Visiting statement:
>> >> _1 = s_4;
>> >> Found new range for _1: [s_4, s_4]
>> >> marking stmt to be not simulated again
>> >>
>> >> Visiting statement:
>> >> _7 = s_4 - _1;
>> >> Applying pattern match.pd:111, gimple-match.c:27997
>> >> Match-and-simplified s_4 - _1 to 0
>> >> Intersecting
>> >>   [0, 0]
>> >> and
>> >>   [0, +INF]
>> >> to
>> >>   [0, 0]
>> >> Found new range for _7: [0, 0]
>> >>
>> >> __attribute__((noclone, noinline))
>> >> fn1 (char * p, char * q)
>> >> {
>> >>   size_t s;
>> >>   long unsigned int _1;
>> >>   long unsigned int _9;
>> >>
>> >>   <bb 2>:
>> >>   s_4 = strlen (q_3(D));
>> >>   _9 = s_4 + 1;
>> >>   __builtin_memcpy (p_5(D), q_3(D), _9);
>> >>   _1 = s_4;
>> >>   return 0;
>> >>
>> >> }
>> >>
>> >>
>> >> After the patch:
>> >> Visiting statement:
>> >> s_4 = strlen (q_3(D));
>> >> Intersecting
>> >>   [0, 9223372036854775806]
>> >> and
>> >>   [0, 9223372036854775806]
>> >> to
>> >>   [0, 9223372036854775806]
>> >> Found new range for s_4: [0, 9223372036854775806]
>> >> marking stmt to be not simulated again
>> >>
>> >> Visiting statement:
>> >> _1 = s_4;
>> >> Intersecting
>> >>   [0, 9223372036854775806]  EQUIVALENCES: { s_4 } (1 elements)
>> >> and
>> >>   [0, 9223372036854775806]
>> >> to
>> >>   [0, 9223372036854775806]  EQUIVALENCES: { s_4 } (1 elements)
>> >> Found new range for _1: [0, 9223372036854775806]
>> >> marking stmt to be not simulated again
>> >>
>> >> Visiting statement:
>> >> _7 = s_4 - _1;
>> >> Intersecting
>> >>   ~[9223372036854775807, 9223372036854775809]
>> >> and
>> >>   ~[9223372036854775807, 9223372036854775809]
>> >> to
>> >>   ~[9223372036854775807, 9223372036854775809]
>> >> Found new range for _7: ~[9223372036854775807, 9223372036854775809]
>> >> marking stmt to be not simulated again
>> >>
>> >> __attribute__((noclone, noinline))
>> >> fn1 (char * p, char * q)
>> >> {
>> >>   size_t s;
>> >>   long unsigned int _1;
>> >>   size_t _7;
>> >>   long unsigned int _9;
>> >>
>> >>   <bb 2>:
>> >>   s_4 = strlen (q_3(D));
>> >>   _9 = s_4 + 1;
>> >>   __builtin_memcpy (p_5(D), q_3(D), _9);
>> >>   _1 = s_4;
>> >>   _7 = s_4 - _1;
>> >>   return _7;
>> >>
>> >> }
>> >>
>> >> Then forwprop4 turns
>> >> _1 = s_4
>> >> _7 = s_4 - _1
>> >> into
>> >> _7 = 0
>> >>
>> >> and we end up with:
>> >> _7 = 0
>> >> return _7
>> >> in optimized dump.
>> >>
>> >> Running ccp again after forwprop4 trivially solves the issue, however
>> >> I am not sure if we want to run ccp again ?
>> >>
>> >> The issue is probably with extract_range_from_ssa_name():
>> >> For _1 = s_4
>> >>
>> >> Before patch:
>> >> VR for s_4 is set to varying.
>> >> So VR for _1 is set to [s_4, s_4] by extract_range_from_ssa_name.
>> >> Since VR for _1 is [s_4, s_4] it implicitly implies that _1 is equal to 
>> >> s_4,
>> >> and vrp is able to transform _7 = s_4 - _1 to _7 = 0 (by using
>> >> match.pd pattern x - x -> 0).
>> >>
>> >> After patch:
>> >> VR for s_4 is set to [0, PTRDIFF_MAX - 1]
>> >> And correspondingly VR for _1 is set to [0, PTRDIFF_MAX - 1]
>> >> so IIUC, we then lose the information that _1 is equal to s_4,
>> >
>> > We don't lose it, it's in its set of equivalencies.
>> Ah, I missed that, thanks. For some reason I had mis-conception that
>> equivalences stores
>> variables which have same value-ranges but are not necessarily equal.
>> >
>> >> and vrp doesn't transform _7 = s_4 - _1 to _7 = 0.
>> >> forwprop4 does that because it sees that s_4 and _1 are equivalent.
>> >> Does this sound correct ?
>> >
>> > Yes.  So the issue is really that vrp_visit_assignment_or_call calls
>> > gimple_fold_stmt_to_constant_1 with vrp_valueize[_1] which when
>> > we do not have a singleton VR_RANGE does not fall back to looking
>> > at equivalences (there's not a good cheap way to do that currently because
>> > VRP doesn't keep a proper copy lattice but simply IORs equivalences
>> > from all equivalences).  In theory simply using the first set bit
>> > might work.  Thus sth like
>> >
>> > @@ -7057,6 +7030,12 @@ vrp_valueize (tree name)
>> >               || is_gimple_min_invariant (vr->min))
>> >           && vrp_operand_equal_p (vr->min, vr->max))
>> >         return vr->min;
>> > +      else if (vr->equiv && ! bitmap_empty_p (vr->equiv))
>> > +       {
>> > +         unsigned num = bitmap_first_set_bit (vr->equiv);
>> > +         if (num < SSA_NAME_VERSION (name))
>> > +           return ssa_name (num);
>> > +       }
>> >      }
>> >    return name;
>> >  }
>> >
>> > might work with the idea of simply doing canonicalization to one of
>> > the equivalences.  But as we don't allow copies in the SSA def stmt
>> > (via vrp_valueize_1) I'm not sure that's good enough canonicalization.
>> IIUC, we record the equivalent variables in vr->equiv
>> but do not canonicalize to one of the equivalence like "copy-of value"
>> in copyprop ?
>> Using first set bit unfortunately doesn't help for the above case.
>>
>> Sorry if this sounds silly, should we just run copyprop/ccp once again
>> after vrp2 to ensure that there are no copies left ?
>
> why?  forwprop also does copy and constant propagation.  For the
> regression simply adjust the pass dump you scan.
Well, with the patch the redundant store to and load from _7 still remains
in optimized dump for fn1() in strlenopt-3.c:

__attribute__((noclone, noinline))
fn1 (char * p, char * q)
{
  size_t s;
  size_t _7;
  long unsigned int _9;

  <bb 2>:
  s_4 = strlen (q_3(D));
  _9 = s_4 + 1;
  __builtin_memcpy (p_5(D), q_3(D), _9);
  _7 = 0;
  return _7;

}

Running ccp again after forwprop4 would get rid of _7.
Without the patch we have return _0; in optimized dump.

Thanks,
Prathamesh
>
>> However that might be quite expensive ?
>> Or make vrp track copies like copyprop using a separate copy-of lattice ?
>
> Ideally we'd unify the three SSA propagation passes into one.  We'd
> have to have separate lattices for copy&constant and range&known-bits.
>
> Richard.
>
>> Thanks,
>> Prathamesh
>> >
>> > Richard.
>>
>>
>
> --
> Richard Biener <rguent...@suse.de>
> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
> 21284 (AG Nuernberg)

Reply via email to