On Mon, Feb 3, 2020 at 7:45 PM Jeff Law <l...@redhat.com> wrote: > > On Fri, 2020-01-31 at 12:04 -0700, Martin Sebor wrote: > > Attached is a reworked patch since the first one didn't go far > > enough to solve the major problems. The new solution relies on > > get_range_strlen_dynamic the same way as the sprintf optimization, > > and does away with the determine_min_objsize function and calling > > compute_builtin_object_size. > > > > To minimize testsuite fallout I extended get_range_strlen to handle > > a couple more kinds expressions(*), but I still had to xfail and > > disable a few tests that were relying on being able to use the type > > of the destination object as the upper bound on the string length. > > > > Tested on x86_64-linux. > > > > Martin > > > > [*] With all the issues around MEM_REFs and types this change needs > > extra scrutiny. I'm still not sure I fully understand what can and > > what cannot be safely relied on at this level. > > > > On 1/15/20 6:18 AM, Martin Sebor wrote: > > > The strcmp optimization newly introduced in GCC 10 relies on > > > the size of the smallest referenced array object to determine > > > whether the function can return zero. When the size of > > > the object is smaller than the length of the other string > > > argument the optimization folds the equality to false. > > > > > > The bug report has identified a couple of problems here: > > > 1) when the access to the array object is via a pointer to > > > a (possibly indirect) member of a union, in GIMPLE the pointer > > > may actually point to a different member than the one in > > > the original source code. Thus the size of the array may > > > appear to be smaller than in the source code which can then > > > result in the optimization being invalid. > > > 2) when the pointer in the access may point to two or more > > > arrays of different size (i.e., it's the result of a PHI), > > > assuming it points to the smallest of them can also lead > > > to an incorrect result when the optimization is applied. > > > > > > The attached patch adjusts the optimization to 1) avoid making > > > any assumptions about the sizes of objects accessed via union > > > types, and b) use the size of the largest object in PHI nodes. > > > > > > Tested on x86_64-linux. > > > > > > Martin > > > > > > PR tree-optimization/92765 - wrong code for strcmp of a union member > > > > gcc/ChangeLog: > > > > PR tree-optimization/92765 > > * gimple-fold.c (get_range_strlen_tree): Handle MEM_REF and > > PARM_DECL. > > * tree-ssa-strlen.c (compute_string_length): Remove. > > (determine_min_objsize): Remove. > > (get_len_or_size): Add an argument. Call get_range_strlen_dynamic. > > Avoid using type size as the upper bound on string length. > > (handle_builtin_string_cmp): Add an argument. Adjust. > > (strlen_check_and_optimize_call): Pass additional argument to > > handle_builtin_string_cmp. > > > > gcc/testsuite/ChangeLog: > > > > PR tree-optimization/92765 > > * g++.dg/tree-ssa/strlenopt-1.C: New test. > > * g++.dg/tree-ssa/strlenopt-2.C: New test. > > * gcc.dg/Warray-bounds-58.c: New test. > > * gcc.dg/Wrestrict-20.c: Avoid a valid -Wformat-overflow. > > * gcc.dg/Wstring-compare.c: Xfail a test. > > * gcc.dg/strcmpopt_2.c: Disable tests. > > * gcc.dg/strcmpopt_4.c: Adjust tests. > > * gcc.dg/strcmpopt_10.c: New test. > > * gcc.dg/strlenopt-69.c: Disable tests. > > * gcc.dg/strlenopt-92.c: New test. > > * gcc.dg/strlenopt-93.c: New test. > > * gcc.dg/strlenopt.h: Declare calloc. > > * gcc.dg/tree-ssa/pr92056.c: Xfail tests until pr93518 is resolved. > > * gcc.dg/tree-ssa/builtin-sprintf-warn-23.c: Correct test (pr93517). > > > > diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c > > index ed225922269..d70ac67e1ca 100644 > > --- a/gcc/gimple-fold.c > > +++ b/gcc/gimple-fold.c > > @@ -1280,7 +1280,7 @@ get_range_strlen_tree (tree arg, bitmap *visited, > > strlen_range_kind rkind, > > c_strlen_data *pdata, unsigned eltsize) > > { > > gcc_assert (TREE_CODE (arg) != SSA_NAME); > > - > > + > > /* The length computed by this invocation of the function. */ > > tree val = NULL_TREE; > > > > @@ -1422,7 +1422,42 @@ get_range_strlen_tree (tree arg, bitmap *visited, > > strlen_range_kind rkind, > > type about the length here. */ > > tight_bound = true; > > } > > - else if (VAR_P (arg)) > > + else if (TREE_CODE (arg) == MEM_REF > > + && TREE_CODE (TREE_TYPE (arg)) == ARRAY_TYPE > > + && TREE_CODE (TREE_TYPE (TREE_TYPE (arg))) == INTEGER_TYPE > > + && TREE_CODE (TREE_OPERAND (arg, 0)) == ADDR_EXPR) > > + { > > + /* Handle a MEM_REF into a DECL accessing an array of integers, > > + being conservative about references to extern structures with > > + flexible array members that can be initialized to arbitrary > > + numbers of elements as an extension (static structs are okay). > > + FIXME: Make this less conservative -- see > > + component_ref_size in tree.c. */ > I think it's generally been agreed that we can look at sizes of _DECL > nodes and this code doesn't look like this walks backwards through > casts or anything like that. So the worry would be if we forward > propagated through a cast into the MEM_REF node. > > It looks like forwprop only propagates through "compatible" pointer > conversions. It makes me a bit nervous. > > Jakub/Richi, comments on this hunk?
+ tree ref = TREE_OPERAND (TREE_OPERAND (arg, 0), 0); + tree off = TREE_OPERAND (arg, 1); + if ((TREE_CODE (ref) == PARM_DECL || VAR_P (ref)) + && (!DECL_EXTERNAL (ref) + || !array_at_struct_end_p (arg))) + { I think you'd want decl_binds_to_current_def_p (ref) instead of !DECL_EXTERNAL. Since 'arg' is originally a pointer array_at_struct_end_p is meaningless here since that's about the structure of a reference while the pointer is just a value. So if you're concerned the objects size might not be as it looks like then you have to rely on decl_binds_to_current_def_p only. You also shouldn't use 'off' natively in the code below but use mem_ref_offset to access the embedded offset which is to be interpreted as signed integer (it's a pointer as you use it). You compare it against an unsigned size... > > > > diff --git a/gcc/testsuite/gcc.dg/Wstring-compare.c > > b/gcc/testsuite/gcc.dg/Wstring-compare.c > > index 0ca492db0ab..d1534bf7555 100644 > > --- a/gcc/testsuite/gcc.dg/Wstring-compare.c > > +++ b/gcc/testsuite/gcc.dg/Wstring-compare.c > In general I have a slight preference for pulling these into new files > when we need to xfail them. Why? Because a test which previously > passed, but now fails (even an xfail) causes the tester to flag the > build as failing due to a testsuite regression. > > But I don't think that preference is significant enough to ask you to > redo the work. Just something to ponder in the future. > > > > diff --git a/gcc/testsuite/gcc.dg/strcmpopt_2.c > > b/gcc/testsuite/gcc.dg/strcmpopt_2.c > > index 57d8f651c28..f31761be173 100644 > > --- a/gcc/testsuite/gcc.dg/strcmpopt_2.c > > +++ b/gcc/testsuite/gcc.dg/strcmpopt_2.c > So I'd pulled f1, f3, f5, f7 into a new file. But disabling them like > you've done is reasonable as well. > > > > diff --git a/gcc/testsuite/gcc.dg/strcmpopt_4.c > > b/gcc/testsuite/gcc.dg/strcmpopt_4.c > > index 4e26522eed1..b07fbb6b7b0 100644 > > --- a/gcc/testsuite/gcc.dg/strcmpopt_4.c > > +++ b/gcc/testsuite/gcc.dg/strcmpopt_4.c > THanks for creating the new test. I'd done the exact same thing in my > local tree. I'm a little surprised that f_param passes. Can you > double check that? > > > > diff --git a/gcc/testsuite/gcc.dg/strlenopt-69.c > b/gcc/testsuite/gcc.dg/strlenopt-69.c > > index 9ad8e2e8aac..9df6eeccb97 100644 > > --- a/gcc/testsuite/gcc.dg/strlenopt-69.c > > +++ b/gcc/testsuite/gcc.dg/strlenopt-69.c > I'd pulled the offending tests into a new file, but your approach is > fine too. > > > > > diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c > > index ad9e98973b1..b9972c16e18 100644 > > --- a/gcc/tree-ssa-strlen.c > > +++ b/gcc/tree-ssa-strlen.c > > - > > -/* Given strinfo IDX for ARG, set LENRNG[] to the range of lengths > > - of the string(s) referenced by ARG if it can be determined. > > - If the length cannot be determined, set *SIZE to the size of > > +/* Given strinfo IDX for ARG, sets LENRNG[] to the range of lengths > > + of the string(s) referenced by ARG if it can be determined. > > + If the length cannot be determined, sets *SIZE to the size of > > the array the string is stored in, if any. If no such array is > > - known, set *SIZE to -1. When the strings are nul-terminated set > > - *NULTERM to true, otherwise to false. Return true on success. */ > > + known, sets *SIZE to -1. When the strings are nul-terminated sets > > + *NULTERM to true, otherwise to false. When nonnull uses RVALS to > > + determine range information. Returns true on success. */ > "When nonnull uses RVALS to detemrine range information." That isn't a > sentence and just doesn't seem to make sense. Please review for > comment clarity. > > This looks OK to me with the comment fixed. I like that we drop the > whole determine_min_objsize and replace it it the standard range bits > that we're using elsewhere. > > Please give Richi and Jakub time to chime in on the gimple-fold.c > changes. > > > Jeff >