NagyDonat wrote: > Thanks for all the valuable feedback! > > > We should separate out `State` from `CallEvent` or have some layer that > > would ensure that the State attached with a Call is always up to date. > > @steakhal In an ideal world I agree with this, but I fear that both > suggestions have difficulties: > > * There are lots of methods that rely on the `State` within the > `CallEvent` – even seemingly innocent methods like > `SimpleFunctionCall::getDecl()` can depend on the state (when a function is > called through a pointer). Before checking this I thought that it would be > nice to separate out `State` from the `CallEvent`, but based on this I fear > that this wouldn't be practical. > > * Ensuring that the Call always refers to the most recent state is > practically impossible, because the "most recent state" is only tracked > informally: the developer knows which state variable is the most recent, but > this is not represented in a machine-readable way. (In some functional > languages there are so called "uniqueness types" or the [`State` > monad](https://hackage.haskell.org/package/mtl-2.3.1/docs/Control-Monad-State-Lazy.html) > which can express invariants like "always use the most recent state" but C++ > doesn't provide a suitable abstraction for this goal.) > > > Based on this right now I feel that the best approach would be gradually > improving the situation by smaller changes. > > > > I will create a test with a new debug checker which modifies the state in > > > its `PreCall` callback and validates that the modification is visible > > > through the `CallEvent` objects received by the `EvalCall` and > > > `PointerEscape` callbacks. Is this sufficient, or would you prefer a > > > different kind of test? > > > > > > I think the best way to go about this is a unittest with a custom bug > > report visitor that prints a special trait along the path. Like a logger. > > And the visitor would print these values alongside the ProgramPoint it is > > attached to. > > I don't understand how could a bug reporter visitor help in this situation. > The bug (which is fixed by this PR) is that checker callbacks receive > `CallEvent` objects whose attached state is old (and differs from the state > available through the `CheckerContext` which is also passed to the same > callback). Testing this requires two components: > > * Checker callbacks that validate that the `CallEvent` and the > `CheckerContext` have (references to) the same state. > > * Checker callbacks that modify the state (at suitable points e.g. the > PreCall callback) to ensure that states that _may_ be different are actually > different. It wouldn't be too difficult to find "real" checkers that cause > suitable state changes, but to make the test simple and self-contained I'm > leaning towards a "synthetic" state change from a debug checker (the same one > that also does the validation). > A bug report visitor doesn't help with either of these components, > because it cannot observe the state attached to the `CallEvent` (it is just > passed to the checker callback, but not persisted in the `ExplodedGraph` for > later use by e.g. visitors) and it (obviously) cannot update the state. > > > > From this perspective, yeah, it may be slightly useful to keep the original > > pre-call state around. Say, if you're modeling a function `void foo(int > > *arg)` that effectively does something like `if (*arg == 0) *arg = 1;`, and > > your checker callback is sufficiently complex, then at some point you may > > need to ask "Uhh can you please remind me what the original value was?" - > > and then `CallEvent` comes in helpful. But I don't think this minor benefit > > outweighs the confusion caused by calling that very specific program state > > "the state". I think it's much easier to handle that by deliberately > > remembering the original state in a local variable, and then pass that > > state around. It's probably not a popular situation in the first place. > > I didn't even think about this usecase of the "archived" `State` attached to > the call event. I agree that this could be useful within checker code, and I > also agree that this is probably a rare situation and should be handled by > explicitly saving the original state in a local variable. (Note: this "state > attached to call becomes obsolete within the checker" situation is distinct > from the "checker callback receives a CallEvent with a state that is already > obsolete at the beginning of the call" bug which is fixed by this PR.) > > > First, yeah, like you said, we could have a static or dynamical typestate > > that indicates whether `getReturnValue()` makes sense. (It only ever makes > > sense in `checkPostCall`.) > > I support this idea -- writing a dynamic check would be trivial, and even a > static check is not too difficult. > > > Then, it might be a good idea to simply make `CallEvent.getState()` > > _private_. It wasn't our goal to provide convenient access to the entire > > state. We just wanted a nice `getArgSVal()` and such. Maybe let's provide > > only the actually-useful, non-confusing methods? > > Making `CallEvent.getState()` private would be a very easy "harm prevention" > change that would reduce the damage potential of this footgun. A quick search > reveals that outside of `CallEvent.{cpp,h}` this method is only used twice, > and in both of those locations it would be very simple to get the state from > the readily available `CheckerContext` instead. (This is very fortunate -- I > would have guessed that there is more of this...) In fact I'll soon create a > trivial separate PR for this proposal, because it's a trivial change and I > don't see any downside. > > Unfortunately this is just "sweeping the problems under the rug" and not a > complete solution -- even if we cannot directly access the obsolete old state > object, the analyzer may still run into logically wrong behavior if methods > like `getArgSVal` produce obsolete results from the old state. There are > probably situations where the old state is not _too_ old and the values that > are looked up from it happen to be correct (e.g. a PreCall callback can't > rebind the value of an expression in the `Environment`), but reasoning about > the partial correctness of old states is not a safe long-term solution. > > > Alternatively, it may be a good idea to implement these methods in > > `CheckerContext` itself, and then stop showing `CallEvent` to the user. So > > instead of `CallEvent.getArgSVal()` you'd need to type > > `CheckerContext.getCallArgSVal()`. Unfortunately this disconnects the user > > from the entire polymorphism of the `CallEvent` class. There's actually a > > lot of methods that would need to be reimplemented (like > > `AnyCXXConstructorCall.getCXXThisVal()` or > > `CXXAllocatorCall.getArraySizeVal()) and most of them wouldn't make sense > > most of the time. > > WDYT about a change that keeps the `CallEvent` type hierarchy but ensures > that `CallEvent`s can only be constructed by the `CheckerContext`? I can > envision a situation where: > > * There are "empty"/"stateless"/"template" `CallEvent` objects that are > passed around in the engine e.g. in `ExprEngine::evalCall`. > > * The constructor of `CheckerContext` can (optionally) own a `CallEvent` > (of this empty stateless kind). > > * Checkers can use a new method called `CheckerContext::getCallEvent()` > which combines the stateless call event with the state of the > `CheckerContext` to return a "regular" `CallEvent`. > > > > I think the assertion I propose is the easiest way to test this patch. > > I.e., the fact that the newly added assertion used to crash without the > > rest of the patch but doesn't crash anymore, is a sufficient defense > > against regressions in and of itself. > > @haoNoQ Are you speaking about the assertion to validate that > `CheckerContext.getState() == CallEvent.getState()`? I agree that this > assertion enforces an invariant that should be held, but unfortunately I > don't see a "natural place" for it. > > * Placing this on the engine side of the engine/checker boundary (just > before calling the callback) seems natural, but that's very close to the > location where the `CheckerContext` and the (updated) `CallEvent` are created > -- so we would just write a tautologically passing check that almost looks > like `foo = bar; assert(foo == bar);`. > > * Placing this assertion into every checker callback (that takes a > `CallEvent`) would be too much "pollution" IMO. > > * Finally, we could place this assertion in a debug checker that is > designed for this purpose and activated in a test that ensures that its > callbacks are triggered (with frequent state changes to reveal the use of > obsolete states). This is essentially equivalent to the testing approach that > I suggested.
(I hope I answered everything. If I forgot something, feel free to ask for clarification.) https://github.com/llvm/llvm-project/pull/160707 _______________________________________________ cfe-commits mailing list [email protected] https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
