[PATCH] D80366: [Analyzer] Add `getReturnValueUnderConstruction()` to `CallEvent`

Gabor Marton via Phabricator via cfe-commits Thu, 28 May 2020 09:17:15 -0700

martong added inline comments.


================
Comment at: clang/lib/StaticAnalyzer/Core/ExprEngineCXX.cpp:112
 
+Optional<SVal> ExprEngine::retrieveFromConstructionContext(
+    ProgramStateRef State, const LocationContext *LCtx,
----------------
baloghadamsoftware wrote:
> NoQ wrote:
> > balazske wrote:
> > > baloghadamsoftware wrote:
> > > > baloghadamsoftware wrote:
> > > > > baloghadamsoftware wrote:
> > > > > > baloghadamsoftware wrote:
> > > > > > > NoQ wrote:
> > > > > > > > baloghadamsoftware wrote:
> > > > > > > > > NoQ wrote:
> > > > > > > > > > Please instead re-use the code that computes the object 
> > > > > > > > > > under construction. That'll save you ~50 lines of code and 
> > > > > > > > > > will be more future-proof (eg., standalone temporaries 
> > > > > > > > > > without destructor technically have a construction context 
> > > > > > > > > > with 0 items so when we implement them correctly your 
> > > > > > > > > > procedure will stop working).
> > > > > > > > > That was so my first thought. However, 
> > > > > > > > > `handleConstructionContext()` is private and non-static. Now 
> > > > > > > > > I tried to merge the two methods: if the value is already in 
> > > > > > > > > the construction context, we return it, if not then we add 
> > > > > > > > > it. Is this what you suggest? Or did I misunderstand you? At 
> > > > > > > > > the very beginning I tried to simply use 
> > > > > > > > > `handleConstructionContext()`, but it asserted because the 
> > > > > > > > > value was already in the map.
> > > > > > > > I'd like to preserve the assertion that 
> > > > > > > > objects-under-construction are never filled twice; it's a very 
> > > > > > > > useful sanity check. What you need in your checker is a 
> > > > > > > > function that computes object-under-construction but doesn't 
> > > > > > > > put it into the objects-under-construction map. So you have to 
> > > > > > > > separate the computation from filling in the state.
> > > > > > > OK, so I (fortunately) misundertood you. Thus I should refactor 
> > > > > > > this function to a calculation and a storing part?
> > > > > > OK, I see what you are speaking about, but I have no idea how to do 
> > > > > > it properly. The problem is that the control logic of filling in 
> > > > > > the state also depends on the kind of the construction context. For 
> > > > > > some kinds we do not fill at all. Every way I try it becomes more 
> > > > > > complex and less correct:
> > > > > > 
> > > > > > 1) `NewAllocatedObjectKind`: we do not add this to the state, we 
> > > > > > only retrieve the original.
> > > > > > 2) `SimpleReturnedValueKind` and 
> > > > > > `CXX17ElidedCopyReturnedValueKind`: depending on whether we are in 
> > > > > > top frame we handle this case recursively or we do not fill at all, 
> > > > > > just return the value. What is the construction context item here? 
> > > > > > Maybe the `ReturnStmt`?
> > > > > > 3) `ElidedTemporaryObjectKind`: this is the most problematic: we 
> > > > > > first handle it recursively for the construction context after 
> > > > > > elision, then we also fill it for the elided temporary object 
> > > > > > construction context as well.
> > > > > > 
> > > > > > The only thing I can (maybe) do is to retrieve the construction 
> > > > > > context item. But then the switch is still duplicated for filling, 
> > > > > > because of the different control logic for different construction 
> > > > > > context kinds.
> > > > > > The only thing I can (maybe) do is to retrieve the construction 
> > > > > > context item.
> > > > > 
> > > > > This does not help even for retrieving the value, because its control 
> > > > > logic also depends on the construction context kind. If I do it, it 
> > > > > will be code triplication instead of duplication and results in a 
> > > > > code that is worse to understand than the current one.
> > > > > 
> > > > > 
> > > > > Please instead re-use the code that computes the object under 
> > > > > construction. That'll save you ~50 lines of code and will be more 
> > > > > future-proof (eg., standalone temporaries without destructor 
> > > > > technically have a construction context with 0 items so when we 
> > > > > implement them correctly your procedure will stop working).
> > > > 
> > > > Any solution I can come up with rather adds 100 to 150 lines to the 
> > > > code instead of saving 50. For retrieving we only have to determine the 
> > > > construction context item (the key). For storing we also have to 
> > > > calculate the value. It sounds good to calculate these things in 
> > > > separate functions and then have a simple retriever and store function 
> > > > but there are lots of recursions, double strores, non-stores, 
> > > > retrieving in the store function that make it too complicated.
> > > > 
> > > > Also retrieving is more complex than just determining the item and 
> > > > getting the value from the context: if you look at 
> > > > `SimpleReturnedValueKind` and `CXX17ElidedCopyReturnedValueKind` you 
> > > > can see that we do not use the construction context we got in the 
> > > > parameter (the construction context of the return value) but the 
> > > > construction context of the call instead. For 
> > > > `ElidedTemporaryObjectKind` we take the construction context before the 
> > > > elusion.
> > > > 
> > > > Future proofness: I agree, this is a problem to solve. In cases where 
> > > > the construction context is empty maybe we can move the calculation of 
> > > > the values to a separate function that is called by both the "handler" 
> > > > and the "retriever".
> > > Do I think correctly that `retrieveFromConstructionContext` (in the 
> > > previous version) should return the same value as second part of 
> > > `handleConstructionContext`? It does not look obvious at first. Or do we 
> > > need a function with that purpose? If yes it looks a difficult task 
> > > because the similar "logic" to get the value and update the state. 
> > > Probably it is enough to add a flag to `handleConstructionContext` to not 
> > > make new state. The current code looks bad because calling a "handle" 
> > > type of function (that was private before) only to compute a value.
> > > 
> > > Any solution I can come up with rather adds 100 to 150 lines to the code 
> > > instead of saving 50.
> > 
> > I think the following is a fairly literal implementation of my suggestion:
> > 
> > {F12020285}
> > 
> > It's 26 lines shorter than your `retrieveFromConstructionContext()` (ok, 
> > not 50, duplicating the switch was more annoying than i expected), it adds 
> > zero new logic (the only new piece of logic is to track constructors that 
> > were modeled correctly but their elision wasn't; previously we could fit it 
> > into control flow), it no longer tries to reverse-engineer the original 
> > behavior by duplicating its logic, and it's more future-proof for the 
> > reasons explained above.
> Thank you for working on this.  The point what I did not see (and I still no 
> not see it) is that in your solution (and your proposal) instead of 
> retrieving the value from the construction context we actually recalculate 
> it, thus we do not use the construction context at all (except for 
> `NewAllocatedObjectKind`) to retrieve the location of the return value. Can 
> we guarantee that we always recalculate exactly the same symbolic value as 
> the symbolic values we store in the map? For example, for 
> `SimpleReturnedValueKind` et. al. we conjure a new symbolic value. Is it 
> really the same value than that the one in the map, thus the one which is the 
> real location where the object is constructed? If not, then the checkers 
> might not work correctly because they store into the GDM using a different 
> region as key than they use for retrieving the value.
> I think the following is a fairly literal implementation of my suggestion: 
> ... It's 26 lines shorter than your retrieveFromConstructionContext() (ok, 
> not 50, duplicating the switch was more annoying than i expected), it adds 
> zero new logic (the only new piece of logic is to track constructors that 
> were modeled correctly but their elision wasn't; previously we could fit it 
> into control flow), it no longer tries to reverse-engineer the original 
> behavior by duplicating its logic, and it's more future-proof for the reasons 
> explained above.

+1 for this version. Even though Adam has valid concerns, this version contains 
functions with clear responsibilities.

> Can we guarantee that we always recalculate exactly the same symbolic value 
> as the symbolic values we store in the map? For example, for 
> SimpleReturnedValueKind et. al. we conjure a new symbolic value. Is it really 
> the same value than that the one in the map, thus the one which is the real 
> location where the object is constructed?

I think the guarantee is that in Artem's implementation the switch is 
duplicated. That would be okay for now, but I still consider this error-prone, 
exactly because of the duplication. So, maybe this yields for a common Visitor.

Guys, would it be possible to have a common Visitor for these? Could be 
somewhat similar to `DeclVisitor` with return type as type parameter. And we 
could pass a lambda that is called in every `case`? Or would that  be too 
overkill



CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80366/new/

https://reviews.llvm.org/D80366



_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D80366: [Analyzer] Add `getReturnValueUnderConstruction()` to `CallEvent`

Reply via email to