On Mon, Feb 3, 2020 at 7:45 PM Jeff Law <[email protected]> wrote:
>
> On Fri, 2020-01-31 at 12:04 -0700, Martin Sebor wrote:
> > Attached is a reworked patch since the first one didn't go far
> > enough to solve the major problems. The new solution relies on
> > get_range_strlen_dynamic the same way as the sprintf optimization,
> > and does away with the determine_min_objsize function and calling
> > compute_builtin_object_size.
> >
> > To minimize testsuite fallout I extended get_range_strlen to handle
> > a couple more kinds expressions(*), but I still had to xfail and
> > disable a few tests that were relying on being able to use the type
> > of the destination object as the upper bound on the string length.
> >
> > Tested on x86_64-linux.
> >
> > Martin
> >
> > [*] With all the issues around MEM_REFs and types this change needs
> > extra scrutiny. I'm still not sure I fully understand what can and
> > what cannot be safely relied on at this level.
> >
> > On 1/15/20 6:18 AM, Martin Sebor wrote:
> > > The strcmp optimization newly introduced in GCC 10 relies on
> > > the size of the smallest referenced array object to determine
> > > whether the function can return zero. When the size of
> > > the object is smaller than the length of the other string
> > > argument the optimization folds the equality to false.
> > >
> > > The bug report has identified a couple of problems here:
> > > 1) when the access to the array object is via a pointer to
> > > a (possibly indirect) member of a union, in GIMPLE the pointer
> > > may actually point to a different member than the one in
> > > the original source code. Thus the size of the array may
> > > appear to be smaller than in the source code which can then
> > > result in the optimization being invalid.
> > > 2) when the pointer in the access may point to two or more
> > > arrays of different size (i.e., it's the result of a PHI),
> > > assuming it points to the smallest of them can also lead
> > > to an incorrect result when the optimization is applied.
> > >
> > > The attached patch adjusts the optimization to 1) avoid making
> > > any assumptions about the sizes of objects accessed via union
> > > types, and b) use the size of the largest object in PHI nodes.
> > >
> > > Tested on x86_64-linux.
> > >
> > > Martin
> >
> >
> > PR tree-optimization/92765 - wrong code for strcmp of a union member
> >
> > gcc/ChangeLog:
> >
> > PR tree-optimization/92765
> > * gimple-fold.c (get_range_strlen_tree): Handle MEM_REF and
> > PARM_DECL.
> > * tree-ssa-strlen.c (compute_string_length): Remove.
> > (determine_min_objsize): Remove.
> > (get_len_or_size): Add an argument. Call get_range_strlen_dynamic.
> > Avoid using type size as the upper bound on string length.
> > (handle_builtin_string_cmp): Add an argument. Adjust.
> > (strlen_check_and_optimize_call): Pass additional argument to
> > handle_builtin_string_cmp.
> >
> > gcc/testsuite/ChangeLog:
> >
> > PR tree-optimization/92765
> > * g++.dg/tree-ssa/strlenopt-1.C: New test.
> > * g++.dg/tree-ssa/strlenopt-2.C: New test.
> > * gcc.dg/Warray-bounds-58.c: New test.
> > * gcc.dg/Wrestrict-20.c: Avoid a valid -Wformat-overflow.
> > * gcc.dg/Wstring-compare.c: Xfail a test.
> > * gcc.dg/strcmpopt_2.c: Disable tests.
> > * gcc.dg/strcmpopt_4.c: Adjust tests.
> > * gcc.dg/strcmpopt_10.c: New test.
> > * gcc.dg/strlenopt-69.c: Disable tests.
> > * gcc.dg/strlenopt-92.c: New test.
> > * gcc.dg/strlenopt-93.c: New test.
> > * gcc.dg/strlenopt.h: Declare calloc.
> > * gcc.dg/tree-ssa/pr92056.c: Xfail tests until pr93518 is resolved.
> > * gcc.dg/tree-ssa/builtin-sprintf-warn-23.c: Correct test (pr93517).
> >
> > diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
> > index ed225922269..d70ac67e1ca 100644
> > --- a/gcc/gimple-fold.c
> > +++ b/gcc/gimple-fold.c
> > @@ -1280,7 +1280,7 @@ get_range_strlen_tree (tree arg, bitmap *visited,
> > strlen_range_kind rkind,
> > c_strlen_data *pdata, unsigned eltsize)
> > {
> > gcc_assert (TREE_CODE (arg) != SSA_NAME);
> > -
> > +
> > /* The length computed by this invocation of the function. */
> > tree val = NULL_TREE;
> >
> > @@ -1422,7 +1422,42 @@ get_range_strlen_tree (tree arg, bitmap *visited,
> > strlen_range_kind rkind,
> > type about the length here. */
> > tight_bound = true;
> > }
> > - else if (VAR_P (arg))
> > + else if (TREE_CODE (arg) == MEM_REF
> > + && TREE_CODE (TREE_TYPE (arg)) == ARRAY_TYPE
> > + && TREE_CODE (TREE_TYPE (TREE_TYPE (arg))) == INTEGER_TYPE
> > + && TREE_CODE (TREE_OPERAND (arg, 0)) == ADDR_EXPR)
> > + {
> > + /* Handle a MEM_REF into a DECL accessing an array of integers,
> > + being conservative about references to extern structures with
> > + flexible array members that can be initialized to arbitrary
> > + numbers of elements as an extension (static structs are okay).
> > + FIXME: Make this less conservative -- see
> > + component_ref_size in tree.c. */
> I think it's generally been agreed that we can look at sizes of _DECL
> nodes and this code doesn't look like this walks backwards through
> casts or anything like that. So the worry would be if we forward
> propagated through a cast into the MEM_REF node.
>
> It looks like forwprop only propagates through "compatible" pointer
> conversions. It makes me a bit nervous.
>
> Jakub/Richi, comments on this hunk?
+ tree ref = TREE_OPERAND (TREE_OPERAND (arg, 0), 0);
+ tree off = TREE_OPERAND (arg, 1);
+ if ((TREE_CODE (ref) == PARM_DECL || VAR_P (ref))
+ && (!DECL_EXTERNAL (ref)
+ || !array_at_struct_end_p (arg)))
+ {
I think you'd want decl_binds_to_current_def_p (ref) instead of !DECL_EXTERNAL.
Since 'arg' is originally a pointer array_at_struct_end_p is
meaningless here since
that's about the structure of a reference while the pointer is just a
value. So if
you're concerned the objects size might not be as it looks like then you have to
rely on decl_binds_to_current_def_p only. You also shouldn't use 'off' natively
in the code below but use mem_ref_offset to access the embedded offset
which is to be interpreted as signed integer (it's a pointer as you use it).
You compare it against an unsigned size...
>
>
> > diff --git a/gcc/testsuite/gcc.dg/Wstring-compare.c
> > b/gcc/testsuite/gcc.dg/Wstring-compare.c
> > index 0ca492db0ab..d1534bf7555 100644
> > --- a/gcc/testsuite/gcc.dg/Wstring-compare.c
> > +++ b/gcc/testsuite/gcc.dg/Wstring-compare.c
> In general I have a slight preference for pulling these into new files
> when we need to xfail them. Why? Because a test which previously
> passed, but now fails (even an xfail) causes the tester to flag the
> build as failing due to a testsuite regression.
>
> But I don't think that preference is significant enough to ask you to
> redo the work. Just something to ponder in the future.
>
>
> > diff --git a/gcc/testsuite/gcc.dg/strcmpopt_2.c
> > b/gcc/testsuite/gcc.dg/strcmpopt_2.c
> > index 57d8f651c28..f31761be173 100644
> > --- a/gcc/testsuite/gcc.dg/strcmpopt_2.c
> > +++ b/gcc/testsuite/gcc.dg/strcmpopt_2.c
> So I'd pulled f1, f3, f5, f7 into a new file. But disabling them like
> you've done is reasonable as well.
>
>
> > diff --git a/gcc/testsuite/gcc.dg/strcmpopt_4.c
> > b/gcc/testsuite/gcc.dg/strcmpopt_4.c
> > index 4e26522eed1..b07fbb6b7b0 100644
> > --- a/gcc/testsuite/gcc.dg/strcmpopt_4.c
> > +++ b/gcc/testsuite/gcc.dg/strcmpopt_4.c
> THanks for creating the new test. I'd done the exact same thing in my
> local tree. I'm a little surprised that f_param passes. Can you
> double check that?
>
>
>
> diff --git a/gcc/testsuite/gcc.dg/strlenopt-69.c
> b/gcc/testsuite/gcc.dg/strlenopt-69.c
> > index 9ad8e2e8aac..9df6eeccb97 100644
> > --- a/gcc/testsuite/gcc.dg/strlenopt-69.c
> > +++ b/gcc/testsuite/gcc.dg/strlenopt-69.c
> I'd pulled the offending tests into a new file, but your approach is
> fine too.
>
> >
> > diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
> > index ad9e98973b1..b9972c16e18 100644
> > --- a/gcc/tree-ssa-strlen.c
> > +++ b/gcc/tree-ssa-strlen.c
> > -
> > -/* Given strinfo IDX for ARG, set LENRNG[] to the range of lengths
> > - of the string(s) referenced by ARG if it can be determined.
> > - If the length cannot be determined, set *SIZE to the size of
> > +/* Given strinfo IDX for ARG, sets LENRNG[] to the range of lengths
> > + of the string(s) referenced by ARG if it can be determined.
> > + If the length cannot be determined, sets *SIZE to the size of
> > the array the string is stored in, if any. If no such array is
> > - known, set *SIZE to -1. When the strings are nul-terminated set
> > - *NULTERM to true, otherwise to false. Return true on success. */
> > + known, sets *SIZE to -1. When the strings are nul-terminated sets
> > + *NULTERM to true, otherwise to false. When nonnull uses RVALS to
> > + determine range information. Returns true on success. */
> "When nonnull uses RVALS to detemrine range information." That isn't a
> sentence and just doesn't seem to make sense. Please review for
> comment clarity.
>
> This looks OK to me with the comment fixed. I like that we drop the
> whole determine_min_objsize and replace it it the standard range bits
> that we're using elsewhere.
>
> Please give Richi and Jakub time to chime in on the gimple-fold.c
> changes.
>
>
> Jeff
>