NagyDonat wrote: Thanks for all the valuable feedback!
> We should separate out `State` from `CallEvent` or have some layer that would > ensure that the State attached with a Call is always up to date. @steakhal In an ideal world I agree with this, but I fear that both suggestions have difficulties: - There are lots of methods that rely on the `State` within the `CallEvent` – even seemingly innocent methods like `SimpleFunctionCall::getDecl()` can depend on the state (when a function is called through a pointer). Before checking this I thought that it would be nice to separate out `State` from the `CallEvent`, but based on this I fear that this wouldn't be practical. - Ensuring that the Call always refers to the most recent state is practically impossible, because the "most recent state" is only tracked informally: the developer knows which state variable is the most recent, but this is not represented in a machine-readable way. (In some functional languages there are so called "uniqueness types" or the [`State` monad](https://hackage.haskell.org/package/mtl-2.3.1/docs/Control-Monad-State-Lazy.html) which can express invariants like "always use the most recent state" but C++ doesn't provide a suitable abstraction for this goal.) Based on this right now I feel that the best approach would be gradually improving the situation by smaller changes. > > I will create a test with a new debug checker which modifies the state in > > its `PreCall` callback and validates that the modification is visible > > through the `CallEvent` objects received by the `EvalCall` and > > `PointerEscape` callbacks. Is this sufficient, or would you prefer a > > different kind of test? > > I think the best way to go about this is a unittest with a custom bug report > visitor that prints a special trait along the path. Like a logger. And the > visitor would print these values alongside the ProgramPoint it is attached to. I don't understand how could a bug reporter visitor help in this situation. The bug (which is fixed by this PR) is that checker callbacks receive `CallEvent` objects whose attached state is old (and differs from the state available through the `CheckerContext` which is also passed to the same callback). Testing this requires two components: - Checker callbacks that validate that the `CallEvent` and the `CheckerContext` have (references to) the same state. - Checker callbacks that modify the state (at suitable points e.g. the PreCall callback) to ensure that states that _may_ be different are actually different. It wouldn't be too difficult to find "real" checkers that cause suitable state changes, but to make the test simple and self-contained I'm leaning towards a "synthetic" state change from a debug checker (the same one that also does the validation). A bug report visitor doesn't help with either of these components, because it cannot observe the state attached to the `CallEvent` (it is just passed to the checker callback, but not persisted in the `ExplodedGraph` for later use by e.g. visitors) and it (obviously) cannot update the state. > From this perspective, yeah, it may be slightly useful to keep the original > pre-call state around. Say, if you're modeling a function `void foo(int > *arg)` that effectively does something like `if (*arg == 0) *arg = 1;`, and > your checker callback is sufficiently complex, then at some point you may > need to ask "Uhh can you please remind me what the original value was?" - and > then `CallEvent` comes in helpful. But I don't think this minor benefit > outweighs the confusion caused by calling that very specific program state > "the state". I think it's much easier to handle that by deliberately > remembering the original state in a local variable, and then pass that state > around. It's probably not a popular situation in the first place. I didn't even think about this usecase of the "archived" `State` attached to the call event. I agree that this could be useful within checker code, and I also agree that this is probably a rare situation and should be handled by explicitly saving the original state in a local variable. (Note: this "state attached to call becomes obsolete within the checker" situation is distinct from the "checker callback receives a CallEvent with a state that is already obsolete at the beginning of the call" bug which is fixed by this PR.) > First, yeah, like you said, we could have a static or dynamical typestate > that indicates whether `getReturnValue()` makes sense. (It only ever makes > sense in `checkPostCall`.) I support this idea -- writing a dynamic check would be trivial, and even a static check is not too difficult. > Then, it might be a good idea to simply make `CallEvent.getState()` > _private_. It wasn't our goal to provide convenient access to the entire > state. We just wanted a nice `getArgSVal()` and such. Maybe let's provide > only the actually-useful, non-confusing methods? Making `CallEvent.getState()` private would be a very easy "harm prevention" change that would reduce the damage potential of this footgun. A quick search reveals that outside of `CallEvent.{cpp,h}` this method is only used twice, and in both of those locations it would be very simple to get the state from the readily available `CheckerContext` instead. (This is very fortunate -- I would have guessed that there is more of this...) In fact I'll soon create a trivial separate PR for this proposal, because it's a trivial change and I don't see any downside. Unfortunately this is just "sweeping the problems under the rug" and not a complete solution -- even if we cannot directly access the obsolete old state object, the analyzer may still run into logically wrong behavior if methods like `getArgSVal` produce obsolete results from the old state. There are probably situations where the old state is not _too_ old and the values that are looked up from it happen to be correct (e.g. a PreCall callback can't rebind the value of an expression in the `Environment`), but reasoning about the partial correctness of old states is not a safe long-term solution. > Alternatively, it may be a good idea to implement these methods in > `CheckerContext` itself, and then stop showing `CallEvent` to the user. So > instead of `CallEvent.getArgSVal()` you'd need to type > `CheckerContext.getCallArgSVal()`. Unfortunately this disconnects the user > from the entire polymorphism of the `CallEvent` class. There's actually a lot > of methods that would need to be reimplemented (like > `AnyCXXConstructorCall.getCXXThisVal()` or > `CXXAllocatorCall.getArraySizeVal()) and most of them wouldn't make sense > most of the time. WDYT about a change that keeps the `CallEvent` type hierarchy but ensures that `CallEvent`s can only be constructed by the `CheckerContext`? I can envision a situation where: - There are "empty"/"stateless"/"template" `CallEvent` objects that are passed around in the engine e.g. in `ExprEngine::evalCall`. - The constructor of `CheckerContext` can (optionally) own a `CallEvent` (of this empty stateless kind). - Checkers can use a new method called `CheckerContext::getCallEvent()` which combines the stateless call event with the state of the `CheckerContext` to return a "regular" `CallEvent`. > I think the assertion I propose is the easiest way to test this patch. I.e., > the fact that the newly added assertion used to crash without the rest of the > patch but doesn't crash anymore, is a sufficient defense against regressions > in and of itself. @haoNoQ Are you speaking about the assertion to validate that `CheckerContext.getState() == CallEvent.getState()`? I agree that this assertion enforces an invariant that should be held, but unfortunately I don't see a "natural place" for it. - Placing this on the engine side of the engine/checker boundary (just before calling the callback) seems natural, but that's very close to the location where the `CheckerContext` and the (updated) `CallEvent` are created -- so we would just write a tautologically passing check that almost looks like `foo = bar; assert(foo == bar);`. - Placing this assertion into every checker callback (that takes a `CallEvent`) would be too much "pollution" IMO. - Finally, we could place this assertion in a debug checker that is designed for this purpose and activated in a test that ensures that its callbacks are triggered (with frequent state changes to reveal the use of obsolete states). This is essentially equivalent to the testing approach that I suggested. https://github.com/llvm/llvm-project/pull/160707 _______________________________________________ cfe-commits mailing list [email protected] https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
