On December 15, 2017 4:58:14 PM GMT+01:00, Martin Sebor <[email protected]> wrote:
>On 12/15/2017 01:48 AM, Richard Biener wrote:
>> On Thu, Dec 14, 2017 at 5:01 PM, Martin Sebor <[email protected]> wrote:
>>> On 12/14/2017 03:43 AM, Richard Biener wrote:
>>>>
>>>> On Wed, Dec 13, 2017 at 4:47 AM, Martin Sebor <[email protected]> wrote:
>>>>>
>>>>> On 12/12/2017 05:35 PM, Jeff Law wrote:
>>>>>>
>>>>>>
>>>>>> On 12/12/2017 01:15 PM, Martin Sebor wrote:
>>>>>>>
>>>>>>>
>>>>>>> Bug 83373 - False positive reported by -Wstringop-overflow, is
>>>>>>> another example of a warning triggered by a missed optimization
>>>>>>> opportunity, this time in the strlen pass. The optimization
>>>>>>> is discussed in pr78450 - strlen(s) return value can be assumed
>>>>>>> to be less than the size of s. The gist of it is that the
>>>>>>> result of strlen(array) can be assumed to be less than the size
>>>>>>> of the array (except in the corner case of last struct members).
>>>>>>>
>>>>>>> To avoid the false positive the attached patch adds this
>>>>>>> optimization to the strlen pass. Although the patch passes
>>>>>>> bootstrap and regression tests for all front-ends I'm not sure
>>>>>>> the way it determines the upper bound of the range is 100%
>>>>>>> correct for languages with arrays with a non-zero lower bound.
>>>>>>> Maybe it's just not as tight as it could be.
>>>>>>
>>>>>>
>>>>>> What about something hideous like
>>>>>>
>>>>>> struct fu {
>>>>>> char x1[10];
>>>>>> char x2[10];
>>>>>> int avoid_trailing_array;
>>>>>> };
>>>>>>
>>>>>> Where objects stored in x1 are not null terminated. Are we in
>>>>>> the realm of undefined behavior at that point (I hope so)?
>>>>>
>>>>>
>>>>>
>>>>> Yes, this is undefined. Pointer arithmetic (either direct or
>>>>> via standard library functions) is only defined for pointers
>>>>> to the same object or subobject. So even something like
>>>>>
>>>>> memcpy (pfu->x1, pfu->x1 + 10, 10);
>>>>>
>>>>> is undefined.
>>>>
>>>>
>>>> There's nothing undefined here - computing the pointer pointing
>>>> to one-after-the-last element of an array is valid (you are just
>>>> not allowed to dereference it).
>>>
>>>
>>> Right, and memcpy dereferences it, so it's undefined.
>>
>> That's an interpretation of the standard that I don't share.
>
>It's not an interpretation. It's a basic rule of the languages
>that the standards are explicit about. In C11 you will find
>this specified in detail in 6.5.6, paragraphs 7 and 8 (of
>particular relevance to your question below is p7: "a pointer
>to an object that is not an element of an array behaves the same
>as a pointer to the first element of an array of length one.")
I know.
>> Also, if I have struct f { int i; int j; }; and an int * that points
>> to the j member you say I have no standard conforming way
>> to get at a pointer to the i member from this, right?
>
>Correct. See above.
>
>> Because
>> the pointer points to an 'int' object. But it also points within
>> a struct f object! So at least maybe (int *)((char *)p - offsetof
>> (struct f, j))
>> should be valid?
>
>No, not really. It works in practice but it's not well-defined.
>It doesn't matter how you get at the result. What matters is
>what you start with. As Jeff said, to derive a pointer to
>distinct subobjects of a larger object you need to start with
>a pointer to the larger object and treat it as an array of
>chars.
Those are obviously not constraints that people use C and C++ under, so I
see no way to enforce this within gimple.
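
For illustration, a minimal C sketch of the derivation described above:
start from a pointer to the enclosing struct f object, view it as an array
of bytes, and add offsetof to reach a member. The helper name member_j is
hypothetical and not from this thread; the sketch only shows the form that
starts from the larger object, as opposed to working backwards from a
pointer to one member.

```c
#include <assert.h>
#include <stddef.h>

struct f { int i; int j; };

/* Hypothetical helper: derive a pointer to member j starting from a
   pointer to the whole struct f object, treated as an array of bytes. */
static int *member_j(struct f *p)
{
    unsigned char *base = (unsigned char *)p;        /* byte view of *p */
    return (int *)(base + offsetof(struct f, j));    /* stays within *p */
}
```

Calling member_j(&obj) yields the same address as &obj.j. The contested
case in this thread is the reverse direction, where the only starting
point is an int * to one member.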
>> This means that pfu->x1 + 10 is a valid pointer
>> into *pfu no matter what you say and you can dereference it.
>
>No.
>
>As another, hopefully more convincing, example consider a
>multidimensional array A[2][2]. The offset of A[i][j] is
>i * sizeof A[i] + j * sizeof A[i][j]. With that, the offset of
>A[1][0] is sizeof A[1] + 0, and so would be the offset of A[0][2].
>But that doesn't make A[0][2] a valid reference to an element of
>A (because A[0] has only two elements, A[0][0] and A[0][1]),
>or A[0] + 2 a dereferenceable pointer. It's a pointer that
>points just past the last element of the array A[0]. That
>there's another array right after A[0] (namely A[1]) is
>immaterial, same as in the struct f example above.
I know. Dependence analysis relies on this. We've had bugs in the past with gcc
itself introducing such bogus references.
Richard.
>
>Martin
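
As a rough sketch of the A[2][2] point, the C fragment below computes
element offsets by arithmetic alone and separately checks whether an index
pair names an element. The element type (char) and the names ROWS, COLS,
elem_offset and is_valid_index are assumptions made for illustration; none
of them appear in the thread. The offset is computed as
i * sizeof(row) + j * sizeof(element).

```c
#include <assert.h>
#include <stddef.h>

enum { ROWS = 2, COLS = 2 };   /* models char A[2][2]; char is assumed */

/* Byte offset of A[i][j] from the start of A, by arithmetic only.
   No array is accessed, so out-of-range indices are harmless here. */
static size_t elem_offset(size_t i, size_t j)
{
    return i * sizeof(char[COLS]) + j * sizeof(char);
}

/* Nonzero only if (i, j) names an element of A under the language rules. */
static int is_valid_index(size_t i, size_t j)
{
    return i < ROWS && j < COLS;
}
```

Here elem_offset(1, 0) and elem_offset(0, 2) are equal, yet only (1, 0)
passes is_valid_index: two expressions can compute the same address while
only one of them is a valid reference to an element.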