[PATCH] D52219: [analyzer] (1/n) Support pointee mutation analysis in ExprMutationAnalyzer.

Jonas Toth via Phabricator via cfe-commits Thu, 20 Sep 2018 09:37:38 -0700

JonasToth added inline comments.


================
Comment at: unittests/Analysis/ExprMutationAnalyzerTest.cpp:67
+    if (const auto *DRE = dyn_cast<DeclRefExpr>(E)) {
+      if (DRE->getNameInfo().getAsString()[0] == 'p')
+        Finder = PointeeMutationFinder;
----------------
shuaiwang wrote:
> JonasToth wrote:
> > shuaiwang wrote:
> > > JonasToth wrote:
> > > > That doesn't look like a super idea. It is super hidden that variable 
> > > > 'p*' will be analyzed for pointee mutation and others aren't. 
> > > > Especially in 12 months and when someone else wants to change 
> > > > something, this will be the source of headaches.
> > > > 
> > > > Don't you think there could be a better way of doing this switch?
> > > Yeah... agree :)
> > > But how about let's keep it this way just for now, and I'll pause the 
> > > work on supporting pointee analysis and fix it properly but in a separate 
> > > diff following this one?
> > > 
> > > I've been thinking about this and also did some prototyping, and here are 
> > > my thoughts:
> > > In order to properly fix this we need to change the interface of 
> > > `ExprMutationAnalyzer` to not simply return a `Stmt` but instead return a 
> > > `MutationTrace` with additional metadata. This wasn't needed before 
> > > because we can reconstruct a mutation chain by continuously calling 
> > > `findMutation`, and that works well for cases like "int x; int& r0 = x; 
> > > int& r1 = r0; r1 = 123;". But now that pointers come into play, and we 
> > > need to handle cases like "int *p; int *p2 = p; int& r = *p2; r = 123;", 
> > > and in order to reconstruct the mutation chain, we need to do 
> > > `findPointeeMutation(p)` -> `findPointeeMutation(p2)` -> 
> > > `findMutation(r)`, notice that we need to switch from calling 
> > > `findPointeeMutation` to calling `findMutation`. However we don't know 
> > > where the switch should happen simply based on what's returned from 
> > > `findPointeeMutation`, we need additional metadata for that.
> > > 
> > > That interface change will unavoidably affect how memoization is done so 
> > > we need to change memoization as well.
> > > 
> > > Now since we need to change both the interface and memoization anyway, we 
> > > might as well design them properly so that multiple-layers of pointers 
> > > can be supported nicely. I think we can end up with something like this:
> > > ```
> > > class ExprMutationAnalyzer {
> > > public:
> > >   struct MutationTrace {
> > >     const Stmt *Value;
> > >     const MutationTrace *Next;
> > >     // Other metadata
> > >   };
> > > 
> > >   // Finds whether any mutative operations are performed on the given 
> > > expression after dereferencing `DerefLayers` times.
> > >   const MutationTrace *findMutation(const Expr *, int DerefLayers = 0);
> > >   const MutationTrace *findMutation(const Decl *, int DerefLayers = 0);
> > > };
> > > ```
> > > This involves quite some refactoring, and doing this refactoring before 
> > > this diff and later rebasing seems quite some trouble that I'd like to 
> > > avoid.
> > > 
> > > Let me know what do you think :)
> > Is the second part of your answer part to adressing this testing issue?
> > Given the whole Analyzer is WIP it is ok to postpone the testing change to 
> > later for me. But I feel that this is a quality issue that more experience 
> > clang develops should comment on because it still lands in trunk. Are you 
> > aware of other patches then clang-tidy that rely on EMA?
> > 
> > Regarding you second part (its related to multi-layer pointer mutation?!):
> > Having a dependency-graph that follows the general data flow _with_ 
> > mutation annotations would be nice to have. There is probably even 
> > something in clang (e.g. `CFG` is probably similar in idea, just with all 
> > execution paths), but I don't know enough to properly judge here.
> > 
> > In my opinion it is ok, to not do the refactoring now and just support one 
> > layer of pointer indirection. Especially in C++ there is usually no need 
> > for multi-layer pointers but more a relict from C-style.
> > C on the other hand relies on that.
> > Designing EMA new for the more generalized functionality (if necessary) 
> > would be of course great, but again: Better analyzer ppl should be involved 
> > then me, even though I would still be very interested to help :)
> The second part is more of trying to explain why I'd prefer to keep the test 
> this way just for this diff.
> Basically we need a refactoring that makes EMA returns MutationTrace in order 
> to fix this test util here.
> 
> But thinking more about this, maybe we don't need that (for now), this util 
> is to help restore a mutation chain, and since we don't support a chain of 
> mutation for pointers yet we can have a simpler version of `pointeeMutatedBy` 
> that doesn't support chaining.
> 
> I mentioned multi-layer pointer mutation because _if_ we were to do a 
> refactoring, we'd better just do it right and take possible features needed 
> in the future into account. But the refactoring itself is motivated by the 
> need of fixing this testing util (and making the return value from 
> `findMutation` generally more useful.)
> 
> BTW by `MutationTrace` it's not as complicated or sophisticated as `CFG`, 
> it's simply a trace helping understand why the mutation happens, for example 
> in this case:
> ```
> int x;
> int &y = x;
> int &z = y;
> z = 123;
> ```
> it's a bit disconnected if we just say `z = 123` mutated 
> `declRefExpr(to(x))`, instead:
> - `findMutation(x)` returns `y`
> - `findMutation(y)` returns `z`
> - `findMutation(z)` returns `z = 123`
> This doesn't apply to (majority) of cases where the mutative expression is an 
> ancestor of the given expression, in those cases it's typically quite clear 
> why the ancestor expression mutated its descendant.
> 
> Anyway, I'll remove this little hackery here and address other comments later 
> (hopefully today), and I'll do a refactoring before pointer analysis supports 
> chaining.
Ok, understanding the decision making of the EMA would be a good thing to have 
in general as well.


Repository:
  rC Clang

https://reviews.llvm.org/D52219



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D52219: [analyzer] (1/n) Support pointee mutation analysis in ExprMutationAnalyzer.

Reply via email to