usaxena95 added inline comments.

================
Comment at: clang/include/clang/Tooling/Syntax/Tokens.h:372
+  // useful while finding expanded tokens in a 'token range'.
+  llvm::DenseMap<SourceLocation, unsigned> ExpandedTokIndex;
   llvm::DenseMap<FileID, MarkedFile> Files;
----------------
kadircet wrote:
> this definitely makes sense, but most of the users of `TokenBuffers` are 
> currently invoking `expandedTokens(Range)` only once, so i am not sure if 
> this is a huge benefit on their side, whereas they'll likely end up seeing 
> some short-lived memory consumption (and an extra traversal of all the 
> expanded tokens).
> 
> Any reasons for not moving this logic to the application? `expandedTokens()` 
> endpoint will return all the tokens, no questions asked. Then the application 
> can build whatever acceleration structure it wants. If you think there are 
> in-tree use cases that can benefit from this caching, maybe even provide that 
> as a separate helper?
> this definitely makes sense, but most of the users of TokenBuffers are 
> currently invoking expandedTokens(Range) only once
I don't follow the point about users invoking it only once. This is not caching 
the results of `expandedTokens(Range)`. We are precomputing these indices to 
completely avoid a binary search that relies on `isBeforeInTranslationUnit` 
(which is considerably slower). The extra traversal of the expanded tokens is 
not noticeable compared to the latency of `isBeforeInTranslationUnit`. Users 
like SelectionTree (GoToDefinition, Hover, ...) benefit directly from this even 
though they query it only once per range.

If you meant use cases like: build a TokenBuffer -> call expandedTokens(Range) 
once -> destroy the TokenBuffer, I don't immediately see such use cases, and 
even then this is still better than using isBeforeInTranslationUnit.
My mental model is that a TokenBuffer is a long-lived object (e.g. the one in 
ParsedAST in clangd). Building it, using it only once, and then discarding it 
is expensive in itself.

> Any reasons for not moving this logic to the application?
> If you think there are in-tree use cases that can benefit from this caching, 
> maybe even provide that as a separate helper?
This can benefit essentially all users of `expandedTokens(Range)`. Since 
expandedTokens(Range) is provided by TokenBuffer, the optimization belongs in 
TokenBuffer rather than in each application.
Users that only query non-token ranges will pay an additional cost, though. We 
could make this precomputation optional for such cases.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99086/new/

https://reviews.llvm.org/D99086

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits