NagyDonat wrote:

@alejandro-alvarez-sonarsource Thanks for the detailed answers and 
clarifications! :smile:

> I tend to be conservative about existing behavior because
> 
>     1. It is working!
> 
>     2. We have been relying on it for long, and even a small divergence could 
> turn into hundreds, if not thousands, of new positives and negatives coming 
> and going on existing projects (changes that unfortunately are not caught by 
> lit tests)
> 
>     3. Not to mention a slight change in assumptions can cascade to other CSA 
> checkers (both up and downstream) adding and/or removing another big set of 
> raised issues.
> 
> 
> Of course this is not news for you!, but this is why I have preferred to keep 
> the surface of the changes small (at the cost of duplication).

Understood, and I agree in general. However, note that `ArrayBound` is a new 
checker (I brought it out of alpha this year), so I feel that it may be a bit 
more malleable than the "ancient" checkers like e.g. `CallAndMessage`.

-----

> > * Perhaps we don't need the complex old logic that handles multidimensional 
> > arrays by calculating the offset from the beginning of the whole array (by 
> > combining multiple `ElementRegion`s) -- IIUC that logic was introduced when 
> > this checker was still using `check::Location` and `check::Bind` callbacks 
> > that provided less information than the current callbacks.
> > * Perhaps we could switch to using indices instead of byte offsets by 
> > default -- because this would produce simpler symbolic expressions that 
> > would be "more useful" for the rest of the analyzer (e.g. instead of 
> > assuming `4 * x < 40` we could assume `x < 10` which can also occur as e.g. 
> > an `if` condition).
> 
> Can we? How about cases such as this lit test?
> 
> ```c++
> struct two_bytes convertedArray2(void) {
>   // We report this with byte offsets because the offset is not divisible by 
> the element size.
>   struct two_bytes a = {0, 0};
>   char *p = (char*)&a;
>   return *((struct two_bytes*)(p + 7));
>   // expected-warning@-1 {{Out of bound access to memory after the end of 
> 'a'}}
>   // expected-note@-2 {{Access of 'a' at byte offset 7, while it holds only 2 
> bytes}}
> }
> ```

Hmm, we will probably need byte offsets as a fallback option if we need to 
cover cases like this. (Which is not an absolute requirement IMO -- we can 
accept not reporting a few rare corner cases if it is compensated by better 
behavior in the common case.)

> [...] It's quite a coincidence (or perhaps not!). I am also working in 
> parallel in `PointerArithChecker` which is turning out to be very similar too 
> (except it raises the moment the pointer is formed, and that 
> one-after-the-end is still ok, as long as it is not derefenced). So indeed a 
> "core bound checker" library sounds very interesting.

It is nice to hear that you are progressing with `PointerArith` :smile:

As my next goal in this area I will try to design the interface of the "core 
bound checker" library, which should be general enough to cover all the use 
cases. I'll try to post a discourse post about this within a few days.

> > In fact, I started to work on this change two days ago, but I'll probably 
> > need to restart that work because this PR affects lots of code within 
> > `ArrayBoundChecker.cpp`.
> 
> Sorry 😅 I am thinking, maybe you go ahead with your work on those changes, 
> since you have already progressed. [...]

No problem :smile: collisions are natural in open source development.

Now that this discussion highlighted the question of introducing a library-like 
interface for bounds checking, I will probably work on that and discard my 
unfinished change (which would complicate the `performCheck` and make the 
separation of the library interface more difficult).

https://github.com/llvm/llvm-project/pull/159357
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to