On 11/6/19 1:27 PM, Martin Sebor wrote:
> On 11/6/19 11:55 AM, Jeff Law wrote:
>> On 11/6/19 11:00 AM, Martin Sebor wrote:
>>> The -Wstringop-overflow warnings for single-byte and multi-byte
>>> stores mention the amount of data being stored and the amount of
>>> space remaining in the destination, such as:
>>>
>>> warning: writing 4 bytes into a region of size 0 [-Wstringop-overflow=]
>>>    123 |   *p = 0;
>>>        |   ~~~^~~
>>> note: destination object declared here
>>>     45 |   char b[N];
>>>        |        ^
>>>
>>> A warning like this can take some time to analyze.  First, the size
>>> of the destination isn't mentioned and may not be easy to tell from
>>> the sources.  In the note above, when N's value is the result of
>>> some non-trivial computation, chasing it down may be a small project
>>> in and of itself.  Second, it's also not clear why the region size
>>> is zero.  It could be because the offset is exactly N, or because
>>> it's negative, or because it's in some range greater than N.
>>>
>>> Mentioning both the size of the destination object and the offset
>>> makes the existing messages clearer, are will become essential when
>>> GCC starts diagnosing overflow into allocated buffers (as my
>>> follow-on patch does).
>>>
>>> The attached patch enhances -Wstringop-overflow to do this by
>>> letting compute_objsize return the offset to its caller, doing
>>> something similar in get_stridx, and adding a new function to
>>> the strlen pass to issue this enhanced warning (eventually, I'd
>>> like the function to replace the -Wstringop-overflow handler in
>>> builtins.c).  With the change, the note above might read something
>>> like:
>>>
>>> note: at offset 11 to object ‘b’ with size 8 declared here
>>>     45 |   char b[N];
>>>        |        ^
>>>
>>> Tested on x86_64-linux.
>>>
>>> Martin
>>>
>>> gcc-store-offset.diff
>>>
>>> gcc/ChangeLog:
>>>
>>>     * builtins.c (compute_objsize): Add an argument and set it to offset
>>>     into destination.
>>>     * builtins.h (compute_objsize): Add an argument.
>>>     * tree-object-size.c (addr_object_size): Add an argument and set it
>>>     to offset into destination.
>>>     (compute_builtin_object_size): Same.
>>>     * tree-object-size.h (compute_builtin_object_size): Add an argument.
>>>     * tree-ssa-strlen.c (get_addr_stridx): Add an argument and set it
>>>     to offset into destination.
>>>     (maybe_warn_overflow): New function.
>>>     (handle_store): Call maybe_warn_overflow to issue warnings.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>>     * c-c++-common/Wstringop-overflow-2.c: Adjust text of expected
>>> messages.
>>>     * g++.dg/warn/Wstringop-overflow-3.C: Same.
>>>     * gcc.dg/Wstringop-overflow-17.c: Same.
>>>
>>
>>> Index: gcc/tree-ssa-strlen.c
>>> ===================================================================
>>> --- gcc/tree-ssa-strlen.c    (revision 277886)
>>> +++ gcc/tree-ssa-strlen.c    (working copy)
>>> @@ -189,6 +189,52 @@ struct laststmt_struct
>>>   static int get_stridx_plus_constant (strinfo *, unsigned
>>> HOST_WIDE_INT, tree);
>>>   static void handle_builtin_stxncpy (built_in_function,
>>> gimple_stmt_iterator *);
>>>   +/* Sets MINMAX to either the constant value or the range VAL is in
>>> +   and returns true on success.  */
>>> +
>>> +static bool
>>> +get_range (tree val, wide_int minmax[2], const vr_values *rvals = NULL)
>>> +{
>>> +  if (tree_fits_uhwi_p (val))
>>> +    {
>>> +      minmax[0] = minmax[1] = wi::to_wide (val);
>>> +      return true;
>>> +    }
>>> +
>>> +  if (TREE_CODE (val) != SSA_NAME)
>>> +    return false;
>>> +
>>> +  if (rvals)
>>> +    {
>>> +      gimple *def = SSA_NAME_DEF_STMT (val);
>>> +      if (gimple_assign_single_p (def)
>>> +      && gimple_assign_rhs_code (def) == INTEGER_CST)
>>> +    {
>>> +      /* get_value_range returns [0, N] for constant assignments.  */
>>> +      val = gimple_assign_rhs1 (def);
>>> +      minmax[0] = minmax[1] = wi::to_wide (val);
>>> +      return true;
>>> +    }
>> Umm, something seems really off with this hunk.  If the SSA_NAME is set
>> via a simple constant assignment, then the range ought to be a singleton
>> ie [CONST,CONST].   Is there are particular test were this is not true?
>>
>> The only way offhand I could see this happening is if originally the RHS
>> wasn't a constant, but due to optimizations it either simplified into a
>> constant or a constant was propagated into an SSA_NAME appearing on the
>> RHS.  This would have to happen between the last range analysis and the
>> point where you're making this query.
> 
> Yes, I think that's right.  Here's an example where it happens:
> 
>   void f (void)
>   {
>     char s[] = "1234";
>     unsigned n = strlen (s);
>     char vla[n];   // or malloc (n)
>     vla[n] = 0;    // n = [4, 4]
>     ...
>   }
> 
> The strlen call is folded to 4 but that's not propagated to
> the access until sometime after the strlen pass is done.
Hmm.  Are we calling set_range_info in that case?  That goes behind the
back of pass instance of vr_values.  If so, that might argue we want to
be setting it in vr_values too.

> 
> It's mostly a reminder that there may be room for improvement
> here.  Maybe not for ranges of sizes but possibly for ranges
> of offsets (e.g., if an offset's range is the union of
> [-4, -1] and [7, 9] and the destination array is 4 byes big
> the access is invalid).
I'd be surprised if this happens in practice, but maybe I'm missing
something.

> 
> I've been thinking about how to handle multiple ranges when
> the new range info makes them available.  I'm not sure I see
> how it will be possible to retrofit the existing code to make
> use of them.  It seems that even code that tries to put anti-
> ranges to use today will need to change.  It will be a fun
> exercise.
It's certainly going to be interesting.  As you may have heard in the
meeting yesterday, Aldy and Andrew are looking at expanding how many
subranges are in the representation because we're losing data with the
current representation.

Jeff

Reply via email to