On 11/11/20 6:09 PM, Martin Sebor via Gcc-patches wrote:
> The attached patch builds on top of the series I posted last
> week(*) to improve the detection of out of bounds pointers
> and C++ references, as well as a subset of invalid pointer
> relational and subtraction expressions.
>
> First, as I mentioned last week, the simple compute_objsize
> cache I implemented then is space inefficient.  The changes
> in this update enhance the  cache to reduce the space overhead
> and improve compile-time efficiency.  The cache now consists
> of two arrays, one storing the indices to the other.
> The former is indexed by SSA_NAME version.  The latter also
> contains separate entries for sizes of enclosing objects and
> their members (missing from the first attempt, leading to
> inefficient hacks to overcome the limitation).   These
> improvements let clients look up the provenance of any pointer
> in O(1) time and avoid the hacks.
>
> Second, with the necessary cache improvements above,
> the gimple-array-bounds changes enhance array bounds checking
> in two ways:
> 1) pointers passed to functions or used to initialize C++
> references are checked to see if they're valid and in bounds
> and diagnosed if not (a subset of instances of passing valid
> just-past-past-the-end and so non-dereferenceable pointers to
> functions are also diagnosed)
> 2) relational or difference expressions involving pointers are
> checked to make sure they point to the same object and diagnosed
> if not.
>
> Besides bootstrapping and regtesting I have also tested the full
> patch series with a few packages, including Binutils/GDB, Glibc
> and Valgrind, and verified that it doesn't cause any false
> positives.  The new -Wpointer-compare warning does trigger in
> two or three places in each, for subtracting pointers to distinct
> objects.  Those are true positives, but the code I checked looks
> benign.  In such cases the warning can be suppressed by converting
> the pointers to intptr_t before the subtraction.
>
> Martin
>
> [*] Prerequisite patches:
> https://gcc.gnu.org/pipermail/gcc-patches/2020-November/558127.html
> https://gcc.gnu.org/pipermail/gcc-patches/2020-November/557807.html
> https://gcc.gnu.org/pipermail/gcc-patches/2020-November/557987.html
>
> gcc-wpointer-compare.diff
>
> Improve detection of invalid pointer operations.
>
> gcc/ChangeLog:
>
>       PR middle-end/93848
>       * builtins.c (compute_objsize_r): Add argument.
>       (access_ref::parm): New function.
>       (access_ref::get_ref): Add argument.  Avoid null pointers.  Treat
>       multiple PHIs with all arguments referencing the same object the same
>       as references to that object with a range of offsets.  Clear BASE0
>       for PHIs with arguments referencing distinct objects and different
>       offsets.
>       (access_ref::get_ref_type): New function.
>       (access_ref::size_remaining): Make sure low bound of size is
>       nonnegative.
>       (pointer_query::pointer_query): Adjust ctor.
>       (pointer_query::get_ref): Handle separate index and var caches.
>       (pointer_query::put_ref): Same.
>       (pointer_query::flush_cache): New function.
>       (access_ref::inform_access): Differentiate pointers to allocated
>       objects from those returned by non-allocation functions.
>       (get_offset_range): Make extern.
>       (handle_min_max_size): Add argument.  Pass it to compute_objsize.
>       (compute_objsize_r, compute_objsize): Same.
>       (gimple_call_alloc_p): Make extern.
>       (maybe_emit_free_warning): Adjust and simplify test for built-in
>       functions.
>       * builtins.h (struct access_ref): Add new members, adjust existing.
>       (get_offset_range): Declare.
>       (gimple_call_alloc_p): Same.
>       (compute_objsize): Add argument.
>       * common.opt (-Wpointer-compare): Move option here from gcc/c.opt.
>       * doc/invoke.texi (-Wpointer-compare): Update.
>       * gimple-array-bounds.cc (ptrrefs): New variable.
>       (format_offset_range): New function.
>       (format_subscript_range): New function.
>       (array_bounds_checker::check_array_ref): Remove argument.
>       (array_bounds_checker::check_mem_ref): Remove argument.  Check
>       no-warning bit on the current statement.  Use format_subscript_range.
>       Simplify informational messages.
>       (array_bounds_checker::check_addr_expr): Remove redundant argument
>       from calls.
>       (array_bounds_checker::ptrs_to_distinct): New function.
>       (array_bounds_checker::check_pointer_op): Same.
>       (array_bounds_checker::handle_assign): Same.
>       (array_bounds_checker::handle_call): Same.
>       (array_bounds_checker::check_array_bounds): Set new
>       array_bounds_checker members.  Adjust calls.
>       (check_array_bounds_dom_walker::before_dom_children): Call new
>       array_bounds_checker members.  Set statement visited bit.
>       (array_bounds_checker::check): Create and a pointer_query instance
>       and flush its cache.
>       * gimple-array-bounds.h (class array_bounds_checker): Add new members.
>       * tree-ssa-strlen.c (maybe_warn_overflow): Pass a new argument to
>       compute_objsize.
>       (strlen_dom_walker::strlen_dom_walker): Initialize cache.  Adjust
>       member type.
>       (printf_strlen_execute): Improve debugging output.
>
> gcc/c-family/ChangeLog:
>
>       PR middle-end/93848
>       * c.opt (-Wpointer-compare): Move option to gcc/common.opt.
>
> gcc/c/ChangeLog:
>
>       PR middle-end/93848
>       * c-typeck.c (parser_build_binary_op): Adjust option name.
>
> gcc/cp/ChangeLog:
>
>       PR middle-end/93848
>       * typeck.c (cp_build_binary_op): Adjust option name.
>
> gcc/testsuite/ChangeLog:
>
>       PR middle-end/93848
>       * c-c++-common/Warray-bounds-3.c: Adjust text of expected warnings.
>       * g++.dg/warn/Warray-bounds.C: Same.
>       * g++.dg/warn/delete-array-1.C: Same.
>       * gcc.dg/Walloc-size-larger-than-18.c: Convert pointers to integers
>       to avoid expected warnings.
>       * gcc.dg/Warray-bounds-29.c: Same.  Remove xfails.
>       * gcc.dg/Warray-bounds-31.c: Same.  Add exoecred warnings.
>       * gcc.dg/Warray-bounds-32.c: Add an expected warning.
>       * gcc.dg/Warray-bounds-41.c: Remove xfails.
>       * gcc.dg/Warray-bounds-46.c: Simplify test for warning.
>       * gcc.dg/Warray-bounds-58.c: Add expected warnings.
>       * gcc.dg/Warray-bounds-66.c: Adjust text of expected messages.
>       * gcc.dg/Warray-bounds.c: Same.
>       * gcc.dg/Wreturn-local-addr-8.c: Suppress -Wpointer-compare=2.
>       * gcc.dg/Wstringop-overflow-23.c: Add expected warnings.
>       * gcc.dg/attr-alloc_size-3.c: Convert pointers to integers to avoid
>       expected warnings.
>       * gcc.dg/attr-alloc_size-4.c: Same.
>       * gcc.dg/attr-alloc_size-5.c: Same.
>       * gcc.dg/pr83044.c: Add expected warning.
>       * gcc.dg/strlenopt-57.c: Expect either kind of warning.
>       * gcc.dg/tree-ssa/builtin-sprintf-warn-18.c: Suppress (otherwise
>       expected) -Warray-bounds.
>       * g++.dg/warn/Warray-bounds-14.C: New test.
>       * g++.dg/warn/Wmismatched-new-delete-2.C: New test.
>       * gcc.dg/Warray-bounds-69.c: New test.
>       * gcc.dg/Warray-bounds-71.c: New test.
>       * gcc.dg/Warray-bounds-72.c: New test.
>       * gcc.dg/Warray-bounds-73.c: New test.
>       * gcc.dg/Wpointer-compare-2.c: New test.
>       * gcc.dg/Wpointer-compare-3.c: New test.
>       * gcc.dg/Wpointer-compare-4.c: New test.
>       * gcc.dg/Wpointer-compare.c: New test.
So the worry naturally is that we're late in the game.  But I think we
can reasonably go forward after a degree of testing.

I think you've got non-unique testnames in this patch.  Let's take one
example from Wpointer-compare-4.c:



> +  // { dg-warning "subtracting pointers to distinct objects" "" { xfail 
> *-*-* } .-1 }
> +  // { dg-message "operand 1 is a null pointer constant" "" { xfail *-*-* } 
> .-2 }

Remember, the testname is composed from the filename, the testname in
the dg-whatever (empty strings in both cases above" and the line number
(.-1 and .-2 above both ultimately refer to the same line).  The string
you're searching for does not factor into the testname.

It's possible this specific instance may be OK because of the dg-warning
vs dg-message difference.  But there's others that both use dg-message.

What I would recommend is reviewing the testsuite changes and looking at
each ".-" instance to ensure they're unique.

@@ -224,6 +225,21 @@ access_ref::access_ref (tree bound /* = NULL_TREE */,
>      }
>  }
>  
> +/* Return the PARM_DECL that REF refers to or null if it doesn't.  */
> +
> +tree
> +access_ref::parm () const
> +{
> +  if (!ref || TREE_CODE (ref) != SSA_NAME)
> +    return NULL;
> +
> +  gimple *def_stmt = SSA_NAME_DEF_STMT (ref);
> +  if (!gimple_nop_p (def_stmt))
> +    return NULL;
> +
> +  return SSA_NAME_VAR (ref);
> +}
So I don't see anything in the code that ensures that REF is a
PARM_DECL.   Are these access_ref objects only created when we walk over
the function's PARM_DECLs?    I'm guessing that's the case and that I
just missed it.

Also note that convention is supposed to tag class members with a m_
prefix.  I don't think we're strict about that, but it would certainly
have helped me looking for how these things get set up.  Just searching
for "ref" generates too many unrelated hits.   Please try to use that
convention in the future.

@@ -320,6 +349,11 @@ access_ref::get_ref (vec<access_ref> *all_refs,
>        if (phi_known_size && phi_arg_ref.sizrng[0] < minsize)
>       minsize = phi_arg_ref.sizrng[0];
>  
> +      /* Disregard null pointers in PHIs with two or more arguments.
> +      TODO: Handle this better!  */
> +      if (nullp)
> +     continue;
So a PHI with a single NULL argument is degenerate and should be
propagated away.  If one is still showing up, then I think returning the
equivalent of a "don't know" for all the analysis in here is probably
reasonable.

But what should the semantics be when there are multiple PHI arguments,
one or more of which are NULL?  Does it even make sense to do array
bounds checking in that case?  Presumably we're doing bounds checking
because either we're computing an address and trying to decide if its a
valid address or we're doing an actual memory reference and a NULL
pointer should be faulting in that case.

And just so I understand.  It looks like if a PHI has multiple arguments
that use the argument with the most space remaining for analysis.  Given
two args with the same size, we take the larger one as a tie breaker. 
That means we can miss warnings because we test the index against the
argument with the most remaining size or largest size, right?  I'm not
saying that's the wrong thing to do, I just want to make sure i
understand the mode of operation here.  In fact, it's probably the right
thing to do in terms of minimizing false positives.

On the subject of caching.  I had a conversation with Andrew on this
last week.  There's a reasonable chance he's going to be generalizing
Ranger's cache during gcc-12 and that we'll be able to use it as a
fairly generic caching mechanism.  So I wouldn't spend a ton of time on
adjusting/improving this implementation.   I'm assuming that your cache
is context insensitive, ie the values you're storing into it are global
values and not specific to a path or dominated subtree, right?


> @@ -5402,24 +5582,33 @@ compute_objsize_r (tree ptr, access_ref *pref, 
> pointer_query *qry,
>        return true;
>      }
>  
> -  if (code == VIEW_CONVERT_EXPR)
> +  if (code == ASSERT_EXPR || code == VIEW_CONVERT_EXPR)
Are you really seeing ASSERT_EXPRs?  Those should only exist during VRP
and are generally to be avoided.  I'm not objecting to this, but mostly
want to make you aware that ASSERT_EXPRs are on their way out and if
you're relying heavily on them that's probably a good sign that we're
going to want to revisit this in a Ranger world.

Nit:
s/argumnent/argument/

For your builtins.c changes.

s/curret/current/
s/expressopns/expressions/

For your gimple-array-bounds.cc changes.



I'm going to ask that you not commit right now.  I'm going to allocate a
large fleet of testing machines (probably a few dozen) and get a
baseline test done as quickly as possible, then a second run with this
patch in its current state to get a sense of the fallout.  Most of the
fallout from the prereqs has already been addressed as well as other
bugs that were causing false positives -- so I'm hoping this patch
actually doesn't have a lot of fallout at this point.

jeff

Reply via email to