On 09/18/18 07:31, Jeff Law wrote: > On 9/17/18 1:18 PM, Bernd Edlinger wrote: >> On 09/17/18 20:32, Jeff Law wrote: >>> On 9/17/18 12:20 PM, Bernd Edlinger wrote: >>>> On 09/17/18 19:33, Jeff Law wrote: >>>>> On 9/16/18 1:58 PM, Bernd Edlinger wrote: >>>>>> Hi, >>>>>> >>>>>> this is a cleanup of the recently added strlen/strcpy/stpcpy >>>>>> no nul warning code. >>>>>> >>>>>> Most importantly it moves the SSA_NAME handling from >>>>>> unterminated_array to string_constant, thereby fixing >>>>>> another round of xfails in the strlen and stpcpy test cases. >>>>>> >>>>>> I need to say that the fix for bug 86622 is relying in >>>>>> type info on the pointer which is probably not safe in >>>>>> GIMPLE in the light of the recent discussion. >>>>>> >>>>>> I had to add two further exceptions, which should >>>>>> be safe in general: that is a pointer arithmentic on a string >>>>>> literal is okay, and arithmetic on a string constant >>>>>> that is exactly the size of the whole DECL, cannot >>>>>> access an adjacent member. >>>>>> >>>>>> >>>>>> Bootstrapped and reg-tested on x86_64-pc-linux-gnu. >>>>>> Is it OK for trunk? >>>>>> >>>>>> >>>>>> Thanks >>>>>> Bernd. >>>>>> >>>>>> >>>>>> patch-cleanup-no-nul.diff >>>>>> >>>>>> gcc: >>>>>> 2018-09-16 Bernd Edlinger <bernd.edlin...@hotmail.de> >>>>>> >>>>>> * builtins.h (unterminated_array): Remove prototype. >>>>>> * builtins.c (unterminated_array): Simplify. Make static. >>>>>> (expand_builtin_strcpy_args): Don't use TREE_NO_WARNING here. >>>>>> (expand_builtin_stpcpy_1): Remove warn_string_no_nul without effect. >>>>>> * expr.c (string_constant): Handle SSA_NAME. Add more exceptions >>>>>> where pointer arithmetic is safe. >>>>>> * gimple-fold.c (get_range_strlen): Handle nonstr like in c_strlen. >>>>>> (get_max_strlen): Remove the unnecessary mynonstr handling. >>>>>> (gimple_fold_builtin_strcpy): Simplify. >>>>>> (gimple_fold_builtin_stpcpy): Simplify. >>>>>> (gimple_fold_builtin_sprintf): Remove NO_WARNING propagation >>>>>> without effect. >>>>>> (gimple_fold_builtin_strlen): Simplify. >>>>>> >>>>>> testsuite: >>>>>> 2018-09-16 Bernd Edlinger <bernd.edlin...@hotmail.de> >>>>>> >>>>>> * gcc.dg/warn-stpcpy-no-nul.c: Remove xfails. >>>>>> * gcc.dg/warn-strlen-no-nul.c: Remove xfails. >>>>>> >>>>>> Index: gcc/builtins.c >>>>>> =================================================================== >>>>>> --- gcc/builtins.c (revision 264342) >>>>>> +++ gcc/builtins.c (working copy) >>>>>> @@ -567,31 +567,12 @@ warn_string_no_nul (location_t loc, const char *fn >>>>>> the declaration of the object of which the array is a member or >>>>>> element. Otherwise return null. */ >>>>>> >>>>>> -tree >>>>>> +static tree >>>>>> unterminated_array (tree exp) >>>>>> { >>>>>> - if (TREE_CODE (exp) == SSA_NAME) >>>>>> - { >>>>>> - gimple *stmt = SSA_NAME_DEF_STMT (exp); >>>>>> - if (!is_gimple_assign (stmt)) >>>>>> - return NULL_TREE; >>>>>> - >>>>>> - tree rhs1 = gimple_assign_rhs1 (stmt); >>>>>> - tree_code code = gimple_assign_rhs_code (stmt); >>>>>> - if (code == ADDR_EXPR >>>>>> - && TREE_CODE (TREE_OPERAND (rhs1, 0)) == ARRAY_REF) >>>>>> - rhs1 = rhs1; >>>>>> - else if (code != POINTER_PLUS_EXPR) >>>>>> - return NULL_TREE; >>>>>> - >>>>>> - exp = rhs1; >>>>>> - } >>>>>> - >>>>>> tree nonstr = NULL; >>>>>> - if (c_strlen (exp, 1, &nonstr, 1) == NULL && nonstr) >>>>>> - return nonstr; >>>>>> - >>>>>> - return NULL_TREE; >>>>>> + c_strlen (exp, 1, &nonstr); >>>>>> + return nonstr; >>>>>> } >>>>> Sigh. This is going to conflict with patch #6 from Martin's kit. >>>>> >>>> >>>> Sorry, it just feels wrong to do this folding here instead of in >>>> string_constant. >>>> >>>> I think the handling of POINTER_PLUS_EXPR above, is faulty, >>>> because it is ignoring the offset parameter, which typically >>>> contains the offset part, may add an offset to a different >>>> structure member or another array index. That is what happened >>>> in PR 86622. >>>> >>>> So you are likely looking at the wrong index, or even the wrong >>>> structure member. >>> I'm not disagreeing that something's wrong here -- the whole concept >>> that we extract the rhs and totally ignore the offset seems wrong. I've >>> stumbled over it working through issues with either patch #4 or #6 from >>> Martin. But I felt I needed to go back and reevaluate any assumptions I >>> had about how the code was supposed to be used before I called it out. >>> >>> >>> Your expr.c changes may be worth pushing through independently of the >>> rest. AFAICT they're really just exposing more cases where we can >>> determine that we're working with a stirng constant. >>> >> >> I think that moves just the folding from here to expr.c, >> I don't see how the change in part #6 on unterminated_array >> is supposed to work, I quote it here and add some comments: >> > Essentially there's a couple of concepts he wants to get in > unterminated_array. > > First, given an address, if there's a variable part we want to clear > exact so we can adjust the text of the message in the caller. ie > "exceeds the size" vs "may exceed the size". > > Second, he really wants LEN to be set to the length of the string for > whatever the constant part of the address might be, even if it's > potentially not properly terminated. > > So given: > > const char b[][5] = { /* { dg-message "declared here" } */ > "12", "345", "6789", "abcde" > }; > > A reference like: > > T (&b[3][1] + v0, bsz); > > We want the length of the string at &b[3][1] into *LEN. That's used to > provide the potential length of the string in the message. > > I don't think we can get that from c_strlen right now. Right now we've > defined c_strlen as (reasonably) returning NULL for the length when > we've got an string without proper termination. > > I'm hesitant to provide a boolean argument that essentially says give me > whatever length you've computed, even if the string isn't properly > terminated. But conceptually that's one of the things we'd need. It's > also needed for patch #4. > > Jeff > > ps. Yes there's still some issues in the code. I'm mostly trying to > highlight how it's being used and what requirements we'd have if we were > going to do something like you've suggested in your patch to > unterminated_array. >
Hmm, you know, I wrote a while ago the following: https://gcc.gnu.org/ml/gcc-patches/2018-09/msg00411.html Where I suggested to change c_strlen parameter nonstr to a structure, where additional information could go. There are a lot more informations that could be relevant for warnings. So it would be easy to convey the unterminated string length there, also I could imagine to diagnose unaligned wide string constants, if we have a decl with a too low alignment factor, or a known unaligned offset for instance. Also the location of the expression on which c_strlen found the unterminated string, could be of interest, especially when get_range_strlen follows phi nodes. Well, I have the impression that the 6/6 patch should be split up in a part that uses c_strlen in its current form, and a follow-up patch that adds any additional info. I prefer doing things step-by-step, you know. Note that the example given is probably unsafe: > T (&b[3][1] + v0, bsz); This does not pass the check in string_constant because it looks like _2 = (char*)&b + _1; _3 = strlen (_2); I have no way to prove that v0 is not larger than sizeof (b[3]). However T(&a[1] + v0, bsz) will definitely work. So the warning will likely get completely bogus data out of unterminated_array. By the way test test case for pr87053 is an example of a false positive warning (it comes twice even though TREE_NO_WARNING is used). gcc pr87053.c pr87053.c: In function 'main': pr87053.c:15:26: warning: 'strlen' argument missing terminating nul [-Wstringop-overflow=] 15 | if (__builtin_strlen (u.z) != 7) | ~^~ pr87053.c:11:3: note: referenced argument declared here 11 | } u = {{"1234", "567"}}; | ^ pr87053.c:15:26: warning: 'strlen' argument missing terminating nul [-Wstringop-overflow=] 15 | if (__builtin_strlen (u.z) != 7) | ~^~ pr87053.c:11:3: note: referenced argument declared here 11 | } u = {{"1234", "567"}}; | ^ I think the duplicate warnings will not go away unless this warning is only diagnosed in the tree-ssa-strlen pass instead of the folding code, which happens repeatedly. Bernd.