On 09/18/18 07:31, Jeff Law wrote:
> On 9/17/18 1:18 PM, Bernd Edlinger wrote:
>> On 09/17/18 20:32, Jeff Law wrote:
>>> On 9/17/18 12:20 PM, Bernd Edlinger wrote:
>>>> On 09/17/18 19:33, Jeff Law wrote:
>>>>> On 9/16/18 1:58 PM, Bernd Edlinger wrote:
>>>>>> Hi,
>>>>>>
>>>>>> this is a cleanup of the recently added strlen/strcpy/stpcpy
>>>>>> no nul warning code.
>>>>>>
>>>>>> Most importantly it moves the SSA_NAME handling from
>>>>>> unterminated_array to string_constant, thereby fixing
>>>>>> another round of xfails in the strlen and stpcpy test cases.
>>>>>>
>>>>>> I need to say that the fix for bug 86622 is relying in
>>>>>> type info on the pointer which is probably not safe in
>>>>>> GIMPLE in the light of the recent discussion.
>>>>>>
>>>>>> I had to add two further exceptions, which should
>>>>>> be safe in general: that is a pointer arithmentic on a string
>>>>>> literal is okay, and arithmetic on a string constant
>>>>>> that is exactly the size of the whole DECL, cannot
>>>>>> access an adjacent member.
>>>>>>
>>>>>>
>>>>>> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
>>>>>> Is it OK for trunk?
>>>>>>
>>>>>>
>>>>>> Thanks
>>>>>> Bernd.
>>>>>>
>>>>>>
>>>>>> patch-cleanup-no-nul.diff
>>>>>>
>>>>>> gcc:
>>>>>> 2018-09-16  Bernd Edlinger  <bernd.edlin...@hotmail.de>
>>>>>>
>>>>>>  * builtins.h (unterminated_array): Remove prototype.
>>>>>>            * builtins.c (unterminated_array): Simplify.  Make static.
>>>>>>            (expand_builtin_strcpy_args): Don't use TREE_NO_WARNING here.
>>>>>>  (expand_builtin_stpcpy_1): Remove warn_string_no_nul without effect.
>>>>>>  * expr.c (string_constant): Handle SSA_NAME.  Add more exceptions
>>>>>>  where pointer arithmetic is safe.
>>>>>>  * gimple-fold.c (get_range_strlen): Handle nonstr like in c_strlen.
>>>>>>  (get_max_strlen): Remove the unnecessary mynonstr handling.
>>>>>>  (gimple_fold_builtin_strcpy): Simplify.
>>>>>>  (gimple_fold_builtin_stpcpy): Simplify.
>>>>>>  (gimple_fold_builtin_sprintf): Remove NO_WARNING propagation
>>>>>>  without effect.
>>>>>>  (gimple_fold_builtin_strlen): Simplify.
>>>>>>
>>>>>> testsuite:
>>>>>> 2018-09-16  Bernd Edlinger  <bernd.edlin...@hotmail.de>
>>>>>>
>>>>>>  * gcc.dg/warn-stpcpy-no-nul.c: Remove xfails.
>>>>>>  * gcc.dg/warn-strlen-no-nul.c: Remove xfails.
>>>>>>
>>>>>> Index: gcc/builtins.c
>>>>>> ===================================================================
>>>>>> --- gcc/builtins.c       (revision 264342)
>>>>>> +++ gcc/builtins.c       (working copy)
>>>>>> @@ -567,31 +567,12 @@ warn_string_no_nul (location_t loc, const char *fn
>>>>>>        the declaration of the object of which the array is a member or
>>>>>>        element.  Otherwise return null.  */
>>>>>>     
>>>>>> -tree
>>>>>> +static tree
>>>>>>     unterminated_array (tree exp)
>>>>>>     {
>>>>>> -  if (TREE_CODE (exp) == SSA_NAME)
>>>>>> -    {
>>>>>> -      gimple *stmt = SSA_NAME_DEF_STMT (exp);
>>>>>> -      if (!is_gimple_assign (stmt))
>>>>>> -        return NULL_TREE;
>>>>>> -
>>>>>> -      tree rhs1 = gimple_assign_rhs1 (stmt);
>>>>>> -      tree_code code = gimple_assign_rhs_code (stmt);
>>>>>> -      if (code == ADDR_EXPR
>>>>>> -          && TREE_CODE (TREE_OPERAND (rhs1, 0)) == ARRAY_REF)
>>>>>> -        rhs1 = rhs1;
>>>>>> -      else if (code != POINTER_PLUS_EXPR)
>>>>>> -        return NULL_TREE;
>>>>>> -
>>>>>> -      exp = rhs1;
>>>>>> -    }
>>>>>> -
>>>>>>       tree nonstr = NULL;
>>>>>> -  if (c_strlen (exp, 1, &nonstr, 1) == NULL && nonstr)
>>>>>> -    return nonstr;
>>>>>> -
>>>>>> -  return NULL_TREE;
>>>>>> +  c_strlen (exp, 1, &nonstr);
>>>>>> +  return nonstr;
>>>>>>     }
>>>>> Sigh.  This is going to conflict with patch #6 from Martin's kit.
>>>>>
>>>>
>>>> Sorry, it just feels wrong to do this folding here instead of in
>>>> string_constant.
>>>>
>>>> I think the handling of POINTER_PLUS_EXPR above, is faulty,
>>>> because it is ignoring the offset parameter, which typically
>>>> contains the offset part, may add an offset to a different
>>>> structure member or another array index.  That is what happened
>>>> in PR 86622.
>>>>
>>>> So you are likely looking at the wrong index, or even the wrong
>>>> structure member.
>>> I'm not disagreeing that something's wrong here -- the whole concept
>>> that we extract the rhs and totally ignore the offset seems wrong.  I've
>>> stumbled over it working through issues with either patch #4 or #6 from
>>> Martin.  But I felt I needed to go back and reevaluate any assumptions I
>>> had about how the code was supposed to be used before I called it out.
>>>
>>>
>>> Your expr.c changes may be worth pushing through independently of the
>>> rest.    AFAICT they're really just exposing more cases where we can
>>> determine that we're working with a stirng constant.
>>>
>>
>> I think that moves just the folding from here to expr.c,
>> I don't see how the change in part #6 on unterminated_array
>> is supposed to work, I quote it here and add some comments:
>>
> Essentially there's a couple of concepts he wants to get in
> unterminated_array.
> 
> First, given an address, if there's a variable part we want to clear
> exact so we can adjust the text of the message in the caller.  ie
> "exceeds the size" vs "may exceed the size".
> 
> Second, he really wants LEN to be set to the length of the string for
> whatever the constant part of the address might be, even if it's
> potentially not properly terminated.
> 
> So given:
> 
> const char b[][5] = { /* { dg-message "declared here" } */
>    "12", "345", "6789", "abcde"
> };
> 
> A reference like:
> 
> T (&b[3][1] + v0, bsz);
> 
> We want the length of the string at &b[3][1] into *LEN.  That's used to
> provide the potential length of the string in the message.
> 
> I don't think we can get that from c_strlen right now.  Right now we've
> defined c_strlen as (reasonably) returning NULL for the length when
> we've got an string without proper termination.
> 
> I'm hesitant to provide a boolean argument that essentially says give me
> whatever length you've computed, even if the string isn't properly
> terminated.   But conceptually that's one of the things we'd need.  It's
> also needed for patch #4.
> 
> Jeff
> 
> ps.  Yes there's still some issues in the code.  I'm mostly trying to
> highlight how it's being used and what requirements we'd have if we were
> going to do something like you've suggested in your patch to
> unterminated_array.
> 

Hmm, you know, I wrote a while ago the following:
https://gcc.gnu.org/ml/gcc-patches/2018-09/msg00411.html

Where I suggested to change c_strlen parameter nonstr to a
structure, where additional information could go.

There are a lot more informations that could be relevant
for warnings.

So it would be easy to convey the unterminated string length
there, also I could imagine to diagnose unaligned wide string
constants, if we have a decl with a too low alignment factor,
or a known unaligned offset for instance.  Also the location
of the expression on which c_strlen found the unterminated
string, could be of interest, especially when get_range_strlen
follows phi nodes.

Well, I have the impression that the 6/6 patch should be split
up in a part that uses c_strlen in its current form, and a
follow-up patch that adds any additional info.
I prefer doing things step-by-step, you know.

Note that the example given is probably unsafe:
> T (&b[3][1] + v0, bsz);

This does not pass the check in string_constant because it
looks like _2 = (char*)&b + _1; _3 = strlen (_2);
I have no way to prove that v0 is not larger than sizeof (b[3]).

However T(&a[1] + v0, bsz) will definitely work.

So the warning will likely get completely bogus data out
of unterminated_array.

By the way test test case for pr87053 is an example
of a false positive warning (it comes twice even though
TREE_NO_WARNING is used).

gcc pr87053.c
pr87053.c: In function 'main':
pr87053.c:15:26: warning: 'strlen' argument missing terminating nul 
[-Wstringop-overflow=]
15 |   if (__builtin_strlen (u.z) != 7)
    |                         ~^~
pr87053.c:11:3: note: referenced argument declared here
11 | } u = {{"1234", "567"}};
    |   ^
pr87053.c:15:26: warning: 'strlen' argument missing terminating nul 
[-Wstringop-overflow=]
15 |   if (__builtin_strlen (u.z) != 7)
    |                         ~^~
pr87053.c:11:3: note: referenced argument declared here
11 | } u = {{"1234", "567"}};
    |   ^

I think the duplicate warnings will not go away unless
this warning is only diagnosed in the tree-ssa-strlen
pass instead of the folding code, which happens repeatedly.



Bernd.

Reply via email to