[PATCH] D116328: [ast-matchers] Add hasSubstatement() traversal matcher for caseStmt(), defaultStmt(), labelStmt()

Aaron Ballman via Phabricator via cfe-commits Wed, 05 Jan 2022 12:35:18 -0800

aaron.ballman added a reviewer: sammccall.
aaron.ballman added a subscriber: sammccall.
aaron.ballman added a comment.


In D116328#3223268 <https://reviews.llvm.org/D116328#3223268>, 
@LegalizeAdulthood wrote:

> In D116328#3221995 <https://reviews.llvm.org/D116328#3221995>, @aaron.ballman 
> wrote:
>
>>> Previously if you wanted to match the statement associated with a case, 
>>> default, or labelled statement, you had to use hasDescendant.
>>
>> This isn't true. You can use `has()` to do direct substatemematching, and 
>> I'm wondering whether we need this new matcher at all. e.g., 
>> https://godbolt.org/z/K5sYP757r
>
> Well, the whole point of this was that previous review comments were 
> complaining that I was using an "expensive" matcher
> instead of one that got right to the point.  (What's the difference between 
> `has` and `hasDescendent` anyway?)

`has()` matches the direct descendant AST node, whereas `hasDescendant()` 
matches *any* descendant AST node. So `hasDescendant()` actually is expensive 
-- it can traverse the AST more times than you'd expect. I don't know of any 
performance concerns with `has()` though.

e.g., https://godbolt.org/z/n3adEz4qW (the first query matches nothing, the 
second query has a match)

> I looked at the implementation of "has" and it has a huge amount of machinery 
> underneath it.  I drilled in as follows (my ToT hash is 
> 773ab3c6655f4d2beec25bb3516b4d4fe2eea990 
> <https://reviews.llvm.org/rG773ab3c6655f4d2beec25bb3516b4d4fe2eea990>):
> ASTMatchers.h:3391 declares `has` as an instance of 
> `internal::ArgumentAdaptingMatcherFunc<internal::HasMatcher>`
> `ArgumentAdaptingMatcherFun` appears just to be a factory mechanism for 
> creating the appropriate matcher.
> `HasMatcher` seems to do the actual matching
> `HasMatcher<T, ChildT>::matches` delegates to 
> `ASTMatchFinder::matchesChildOf(const T &Node, ...)`
> `ASTMatchFinder::matchesChildOf(const T &Node, ...)` delegates to another 
> overload of `matchesChildOf`(const DynTypedNode &Node, ...)` after asserting 
> a static type relationship (ASTMatchersInternal.h:762)
> `ASTMatchFinder::matchesChildOf(const DynTypedNode &Node, ...)` is a virtual 
> function with one implementation in `MatchASTVisitor` (ASTMatchFinder.cpp:659)
> `MatchASTVisitor::matchesChildOf` delegates to `memoizedMatchesRecursively` 
> (ASTMatchFinder.cpp:664)
> For non-memoizable nodes, `MatchASTVisitor::memoizedMatchesRecursively` 
> delegates to `matchesRecursively` (ASTMatchFinder.cpp:599)
> `MatchASTVIsitor::matchesRecursively` instantiates a 
> `ASTNodeNotSpelledInSourceScope` and a `MatchChildASTVisitor` upon which it 
> calls `findMatch` (ASTMatchFinder.cpp:645)
> `MatchChildASTVisitor::findMatch` does a series of `DynNode.get<T>` checks 
> and delegates to the appropriate overload of 
> `MatchChildASTVisitor::traverse`, in our case it would be the one for `Stmt`
> `MatchChildASTVisitor::traverse<T>(const T &Node)` delegates to 
> `MatchChildASTVisitor::match<T>(const T &Node)`
> `MatchChildASTVisitor::match<T>(const T &Node)` instantiates a 
> `BoundNodesTreeBuilder` and calls `match` on the held `Matcher`.
>
> Honestly, at this point, I become lost in manually tracing the code through 
> "go to definition" in my IDE.

I'm impressed you got this far -- the AST matching machinery under the hood is 
really quite complex! :-)

> So I switched my unit test for `HasCaseSubstmt` to use `has` and drilled in 
> through the debugger.
> What I saw echoes the above, although I took the wrong branch in my manual 
> analysis in `memoizedMatchesRecursively`, it goes down the memoization path.

Phew, that's good -- I would expect `has` to use the memoization pass.

> However, the whole point of memoizing a result is because that result was 
> expensive to compute and the memoization saves time on subsequent queries
> because the result has been memorized.  In this case, it's a simple getter on 
> `CaseStmt` to obtain the node against which you want to match, so I don't
> see the memoization as having any particular benefit.  (The inner matcher to 
> `hasSubstatement` may be expensive and could be memoized.)  For results
> obtain by the visitor pattern, I can see why you'd want to memoize them as 
> there is lots of code being executed to implement the Visitor pattern.

My understanding (and my recollection here may be wrong) is that `has()` can 
memoize the results because the sub-matcher will always be matched against the 
immediate direct descendant AST node, but you can't get the same behavior from 
`hasDescendant()` or `hasAncestor()` because any of the descendant (or 
ancestor) nodes can be matched, so you have to traverse them instead of 
memoizing. I'm adding @sammccall in case he has information about the 
performance characteristics or thoughts about the proposed matcher in general.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116328/new/

https://reviews.llvm.org/D116328

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D116328: [ast-matchers] Add hasSubstatement() traversal matcher for caseStmt(), defaultStmt(), labelStmt()

Reply via email to