DonatNagyE wrote:

> Note that &array[idx] is perfectly valid code when `idx == number of 
> elements`. And it is relatively common to do that when one is using STL 
> algorithms on arrays:
> 
> ```
> auto it = std::find(&array[0], &array[size], foo);
> ```
>
> Of course, one could use the `begin/end` free functions, but those are only 
> available since C++11.

Oh, dammit, this language is _stupid_... I couldn't imagine a reason to use 
`&array[size]` instead of the cleaner `(array + size)`; but _of course_ STL 
containers need `&array[size]` with their overloaded `operator[]` (note: this 
is not an `ArraySubscriptExpr` but an overloaded operator call, so it's not 
handled by this checker) and then (if you say so) some developers will use this 
for plain arrays as well...

> Could you elaborate on alternative approaches you considered fixing the 
> problem and why you chose this one? E.g., would trying to look at the parent 
> regions for expressions like `foo[idx].bar` work? Or is the source of the 
> problem that you'd also need the exact expression for the subscript instead 
> of the `MemberExpr`?

I considered two approaches, this one and walking on the parent region layers 
(I don't think that there is a third distinct approach). I chose this because:
- This way the implementation is easier to understand and more "connected" to 
the concrete details of the analyzed code.
- A few months ago @steakhal realized that `check::Location` is not sufficient 
on its own and it should've been supplemented with a `check::Bind` callback to 
cover all relevant cases. The commit implementing this 
(https://reviews.llvm.org/D159106) was abandoned, but if we keep 
`check::Location`, we would need to reintroduce something like that as well.
- I _like_ that I have the exact expression for the subscript (or other 
dereferencing expression, like `*ptr` or `ptr->foo`) because this puts the 
warning message to the right source location (e.g. in a convoluted multiline 
expressions with several `[]` and `->` layers) and allows/could allow better 
customization of the message. 

> Alternatively, would it be possible to suppress warnings on the common 
> pattern `&array[idx]` by checking the parent of the subscript expression in 
> the AST (but still emitting a warning when the pointer is dereferenced)?

Yes, that was my plan for resolving this issue; and as you say that it's 
needed, I'll implement it. I'll squeeze the parent lookup into the `if` where 
we know that there's an overflow, so that (AFAIK expensive) operation won't run 
in the common case when the array access is in bounds. The "still emitting a 
warning when the pointer is dereferenced" will happen automatically as at that 
point we'll have the same overflowing pointer but without the `&` that negates 
the result. 

I think that I won't suppress the warning in situations like
```
int array[10];
int *f(int arg) {
  if (arg >= 10)
    return &array[arg];
  return array;
}
```
where the analyzer knows that `idx >= size` and doesn't know that `idx == 
size`. Are there any crazy situations where this is also legitimate?

https://github.com/llvm/llvm-project/pull/72107
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to