On Thu, Oct 19, 2017 at 1:19 AM, Martin Sebor <[email protected]> wrote:
> On 10/18/2017 04:48 AM, Richard Biener wrote:
>>
>> On Wed, Oct 18, 2017 at 5:34 AM, Martin Sebor <[email protected]> wrote:
>>>
>>> While testing my latest -Wrestrict changes I noticed a number of
>>> opportunities to improve the -Warray-bounds warning. Attached
>>> is a patch that implements a solution for the following subset
>>> of these:
>>>
>>> PR tree-optimization/82596 - missing -Warray-bounds on an out-of
>>> bounds index into string literal
>>> PR tree-optimization/82588 - missing -Warray-bounds on an excessively
>>> large index
>>> PR tree-optimization/82583 - missing -Warray-bounds on out-of-bounds
>>> inner indices
>>>
>>> The patch also adds more detail to the -Warray-bounds diagnostics
>>> to make it easier to see the cause of the problem.
>>>
>>> Richard, since the changes touch tree-vrp.c, I look in particular
>>> for your comments.
>>
>>
>> + /* Accesses to trailing arrays via pointers may access storage
>> + beyond the types array bounds. For such arrays, or for flexible
>> + array members as well as for other arrays of an unknown size,
>> + replace the upper bound with a more permissive one that assumes
>> + the size of the largest object is SSIZE_MAX. */
>>
>> I believe handling those cases are somewhat academic, but ...
>
>
> Thanks for the quick review!
>
> I agree the SSIZE_MAX tests handle corner cases and there are
> arguably more important gaps here to plug (e.g., VLAs). Then
> again, most bugs tend to lurk in corner cases of one kind or
> another and these seemed like a good way for me to come up to
> speed on the implementation before tackling those. If you
> have suggestions for which to dive into next I'm happy to take
> them.
>
>> + tree eltype = TREE_TYPE (ref);
>> + tree eltsize = TYPE_SIZE_UNIT (eltype);
>>
>> needs to use array_ref_element_size. Note that this size can be
>> non-constant in which case you now ICE so you have to check
>> this is an INTEGER_CST.
>
>
> Thanks. I did reproduce a few ICEs due to VLAs. I've fixed
> the problems and added tests for them. One-dimensional VLAs
> are now handled the same way arrays of unknown bound are.
> Handling them more intelligently (i.e., taking into account
> the ranges their bounds are in) and especially handling
> multidimensional VLAs will take some effort.
>
>>
>> + tree maxbound = TYPE_MAX_VALUE (ssizetype);
>>
>> please don't use ssizetype - sizetype can be of different precision
>> than pointers (and thus objects). size_type_node would come
>> close (maps to size_t), eventually ptrdiff_type_node is also
>> defined by all frontends.
>
>
> Okay, I've changed it to sizetype (it would be nice if there
> were a cleaner way of doing it rather than by bit-twiddling.)
>
> ptrdiff_type would have been my first choice but it's only defined
> by the C/C++ front ends and not available in the middle-end, so
> the other warnings that consider object sizes deal with ssizetype
> (e.g., -Walloc-size-larger-than, -Wstringop- overflow, and
> -Wformat-overflow).
>
> That said, I'm not sure I understand under what conditions
> ssizetype is not the signed equivalent of sizetype. Can you
> clarify?
I meant to use size_type_node (size_t), not sizetype. But
I just checked that ptrdiff_type_node is initialized in
build_common_tree_nodes and thus always available.
> As an aside, at some point I would like to get away from a type
> based limit in all these warnings and instead use one that can
> be controlled by an option so that a user can impose a lower limit
> on the maximum size of an object and have all size-related warnings
> (and perhaps even optimizations) enforce it and benefit from it.
You could add a --param that is initialized from ptrdiff_type_node.
>> + up_bound_p1 = fold_build2 (TRUNC_DIV_EXPR, ssizetype, maxbound,
>> eltsize);
>> +
>>
>> int_const_binop if you insist on using trees...
>
>
> Sure. (I think offset_int would be more convenient but the rest
> of the function uses trees so I just stuck to those to avoid
> converting things back and forth or disrupting more of the code
> than I had to.)
Understood.
>>
>> + tree arg = TREE_OPERAND (ref, 0);
>> + tree_code code = TREE_CODE (arg);
>> + if (code == COMPONENT_REF)
>> + {
>> + HOST_WIDE_INT off;
>> + if (tree base = get_addr_base_and_unit_offset (ref, &off))
>> + up_bound_p1 = fold_build2 (MINUS_EXPR, ssizetype, up_bound_p1,
>> + TYPE_SIZE_UNIT (TREE_TYPE (base)));
>> + else
>> + return;
>>
>> so this gives up on a.b[i].c.d[k] (ok, array_at_struct_end_p will be
>> false).
>> simply not subtracting anyhing instead of returning would be
>> conservatively
>> correct, no? Likewise subtracting the offset of the array for all
>> "previous"
>> variably indexed components with assuming the lowest value for the index.
>> But as above I think compensating for the offset of the array within the
>> object
>> is academic ... ;)
>
>
> I was going to say yes (it gives up) but on second thought I don't
> think it does. Only the major index can be unbounded and the code
> does consider the size of the sub-array when checking the major
> index. So, IIUC, I think this works correctly as is (*). What
> doesn't work is VLAs but those are a separate problem. Let me
> know if I misunderstood your question.
get_addr_base_and_unit_offset will return NULL if there's any variable
component in 'ref'. So as written it seems to be dead code (you
want to pass 'arg'?)
I was asking you to remove the 'else return' because w/o subtracting
the upper bound is just more conservative.
>
>> + else if (code == STRING_CST)
>> + up_bound_p1 = build_int_cst (ssizetype, TREE_STRING_LENGTH (arg));
>>
>> that one is more interesting -- why's the TYPE_DOMAIN of the STRING_CST
>> lacking
>> a max value? Iff we use build_string_literal it should have the proper
>> type.
>
>
> Good question! STRING_CST does have a domain. The problem is
> that array_at_struct_end_p() returns true for STRING_CST. I've
> added the handling to the function and removed the block above
> from the latest patch.
Can you split out the STRING_CST handling and commit that separately
(split the testcase)? That part looks ok.
Thanks,
Richard.
> Martin
>
> [*] This is diagnosed:
>
> struct C { char d[4]; };
> struct B { struct C c; };
> struct A { struct B b[7]; };
>
> int f (struct A a, unsigned i, unsigned k)
> {
> if (i < 7) i = 7;
> if (k < 4) k = 4;
>
> return a.b[i].c.d[k];
> }
>
> warning: array subscript 4 is above array bounds of ‘char[4]’
> [-Warray-bounds]
> return a.b[i].c.d[k];
> ~~~~~~~~~~^~~
> warning: array subscript 7 is above array bounds of ‘struct B[7]’
> [-Warray-bounds]
> return a.b[i].c.d[k];
> ~~~^~~
>