sammccall added inline comments.

================
Comment at: clang/include/clang/Tooling/Syntax/Tokens.h:372
+  // useful while finding expanded tokens in a 'token range'.
+  llvm::DenseMap<SourceLocation, unsigned> ExpandedTokIndex;
   llvm::DenseMap<FileID, MarkedFile> Files;
----------------
sammccall wrote:
> kadircet wrote:
> > usaxena95 wrote:
> > > kadircet wrote:
> > > > this definitely makes sense, but most of the users of `TokenBuffers` 
> > > > are currently invoking `expandedTokens(Range)` only once, so I am not 
> > > > sure this is a huge benefit on their side, whereas they'll likely 
> > > > end up seeing some short-lived memory consumption (and an extra 
> > > > traversal of all the expanded tokens).
> > > > 
> > > > Any reasons for not moving this logic to the application? 
> > > > `expandedTokens()` endpoint will return all the tokens, no questions 
> > > > asked. Then the application can build whatever acceleration structure 
> > > > it wants. If you think there are in-tree use cases that can benefit 
> > > > from this caching, maybe even provide that as a separate helper?
> > > > this definitely makes sense, but most of the users of TokenBuffers are 
> > > > currently invoking expandedTokens(Range) only once
> > > I don't follow the claim that users are invoking it only once. This is 
> > > not caching the results of `expandedTokens(Range)`. We are precomputing 
> > > this index to completely avoid a binary search that uses 
> > > `isBeforeInTranslationUnit` (which is considerably slower). An extra 
> > > traversal of the expanded tokens is not noticeable compared to the 
> > > latency of `isBeforeInTranslationUnit`. Users like SelectionTree 
> > > (GoToDefinition, Hover, ...) can directly benefit from this even though 
> > > they query it only once per range.
> > > 
> > > Umm, if you meant use cases like: build TokenBuffer -> use 
> > > expandedTokens(Range) -> destroy TokenBuffer, I don't immediately see 
> > > such use cases. But I think this is still better than using 
> > > isBeforeInTranslationUnit.
> > > My mental model is that TokenBuffer is a long-lived object (for example 
> > > in ParsedAST in clangd), and using it only once and then discarding it 
> > > is expensive in itself.
> > > 
> > > > Any reasons for not moving this logic to the application?
> > > > If you think there are in-tree use cases that can benefit from this 
> > > > caching, maybe even provide that as a separate helper?
> > > This can basically benefit all users of `expandedTokens(Range)`. Since 
> > > expandedTokens(Range) is provided by TokenBuffer, the optimization 
> > > should also live in TokenBuffer rather than in the application.
> > > Users who only use this for non-token ranges will pay an additional 
> > > cost, though. We could potentially make this precomputation optional 
> > > for such cases.
> > > 
> > > I didn't get that users are invoking it only once.
> > 
> > Sorry, that was confusing. I was trying to say that for features like 
> > Hover/Go-To, clangd builds the SelectionTree only once. You are right 
> > that this might involve multiple calls to `expandedTokens(Range)`, but 
> > that shouldn't be the bottleneck of these features.
> > 
> > 
> > > This is not caching the results of `expandedTokens(Range)`. We are 
> > > precomputing these to completely avoid a binary search
> > 
> > Right, I was saying that all that precomputation is also probably 
> > unneeded, as features are normally only interested in a single token, 
> > which should nest properly in the AST under normal circumstances and 
> > doesn't require calling expandedTokens for all possible ranges.
> > 
> > > Doing an extra traversal of expandedTokens is not noticeable as compared 
> > > to `isBeforeInTranslationUnit` latency. Users like SelectionTree 
> > > (GoToDefinition, Hover,..) can directly benefit from this even though 
> > > they query this only once per some range.
> > 
> > I was suggesting moving this into the application because it felt 
> > counter-intuitive here: the benchmark you provided is actually running 
> > Hover (and other features) on every (spelled) token of the file. If you 
> > say that it is beneficial even for the normal hover case, sure, let's go 
> > (but I'd be happier if you proved me wrong first 😅 )
> > 
> > > Umm. If you meant use cases like: build token buffer -> use 
> > > expandedTokens(Range) -> destroy TokenBuffer.
> > > I don't immediately see such use cases. But I think it is still better 
> > > than using isBeforeInTranslationUnit.
> > > My mental model is that TokenBuffer is a long lived object (example in 
> > > ParsedAST in clangd). And using it only once and then discarding it is 
> > > expensive in itself.
> > 
> > That's what clangd does, though (not sure if TokenBuffers have any other 
> > users ATM). It literally builds a new ParsedAST after every edit (it 
> > also has a cache to skip some work if contents didn't change between 
> > requests). But usually all of those requests are user-initiated, so an 
> > extra latency of a couple of ms on every request isn't really noticeable.
> > Getting rid of that would still be great, but not if it means slowing 
> > down the initial request by quite a bit (due to the extra traversal of 
> > all the expanded tokens, which might be a huge deal especially on files 
> > including some tablegen'd stuff, as those are usually part of the main 
> > file and not the preamble).
> > 
> > > This can basically benefit all users of `expandedToken(Range)`.
> > 
> > As stated above, I am just worried that this might hurt the use cases in 
> > which an application just builds a TokenBuffer and calls this on a few 
> > ranges. (Maybe clangd is the odd one out, but I don't see any other 
> > users. There's `clang::syntax::computeReplacements`, but it is only used 
> > in a test.)
> > 
> > > Since expandedToken(Range) is provided by TokenBuffer, the optimization 
> > > must also remain in the TokenBuffer as compared to the application.
> > 
> > I don't follow the reasoning here exactly. The application that needs to 
> > operate on every token of the expanded file can call the wrapper rather 
> > than using TokenBuffers directly.
> > 
> > But... the most dominant application in clangd's case is SelectionTree, 
> > and it is going to perform this computation all the time (because it 
> > wants to support both the batch and the single-invocation use case :/ ). 
> > As you suggested, this could be optional (or triggered via an explicit 
> > endpoint), but that's going to be ugly, as the option would need to be 
> > supplied when building a ParsedAST, ugh...
> > 
> > (Sorry for going around in a little circle here.) I suppose it is 
> > cleanest the way you propose it then (unless your application is 
> > actually building ParsedASTs itself without any clangd abstraction like 
> > TUScheduler/ClangdServer).
> > It would still be great if you could check for possible regressions on 
> > the common path. The easiest would be to just look at `ParsedAST::build` 
> > latencies on a couple of files; I suppose 
> > `clang/lib/Serialization/ASTReader.cpp` is a good example of the 
> > pathological case, and winning there would probably mean winning in all 
> > the other cases too, but it would be good to see a more "sane" file as 
> > well. Anything after building the `ParsedAST` is guaranteed to be as 
> > fast as before.
> > 
> > > Although users which only use this for non-token ranges will pay 
> > > additional cost. We can potentially make this precomputation optional for 
> > > such cases.
> > 
> > I suppose that to hit the slow case, users would have to provide spelled 
> > tokens that did not expand to anything, which should be rare, so I don't 
> > think that'll be a huge problem.
> Long thread here :-)
> 
> Kadir: one wrinkle that's maybe missed here is that clangd's SelectionTree 
> calls this method *many* times to establish the bounds of many AST nodes. 
> I fully believe this dominates SelectionTree build time (though it would 
> be nice to see benchmarks).
> the benchmark you provided is actually running Hover (and other features) 
> on every (spelled) token of the file

I finally understood this. You're right, but we want to reuse some of the 
clangd code in batch mode (for code review workflows), which means exactly this!

> If you say that it is beneficial even for the normal hover case

I'd love to have numbers, but my understanding of the cost ordering is: 
parse >> SelectionTree > Hover. So this should be a >50% speedup for hover 
**in the cached AST case**. Though that case is fast anyway, so it's not 
critically important.
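
For anyone skimming the thread, here is a minimal, self-contained sketch of 
the trade-off being discussed. It is not the actual TokenBuffer code: `Loc`, 
`Tok`, and the plain `<` comparison are stand-ins for `SourceLocation`, 
`syntax::Token`, and `SourceManager::isBeforeInTranslationUnit`, and 
`std::unordered_map` stands in for `llvm::DenseMap`.

```
// Sketch of the two lookup strategies over the expanded token stream.
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

using Loc = uint32_t; // stand-in for an encoded SourceLocation
struct Tok { Loc Location; };

// Slow path: binary search over the expanded stream. Each comparison here
// stands in for SourceManager::isBeforeInTranslationUnit, which is expensive
// because it may need to walk macro expansion chains.
size_t findByBinarySearch(const std::vector<Tok> &Expanded, Loc L) {
  auto It = std::partition_point(
      Expanded.begin(), Expanded.end(),
      [&](const Tok &T) { return T.Location < L; });
  return It - Expanded.begin();
}

// Fast path: one O(n) pass builds a map from location to index (the
// DenseMap<SourceLocation, unsigned> in the patch); afterwards every lookup
// is O(1) and never calls the expensive comparator.
std::unordered_map<Loc, unsigned>
buildExpandedTokIndex(const std::vector<Tok> &Expanded) {
  std::unordered_map<Loc, unsigned> Index;
  Index.reserve(Expanded.size());
  for (unsigned I = 0, E = Expanded.size(); I != E; ++I)
    Index.try_emplace(Expanded[I].Location, I);
  return Index;
}

int main() {
  std::vector<Tok> Expanded{{10}, {20}, {30}, {40}};
  auto Index = buildExpandedTokIndex(Expanded);
  assert(findByBinarySearch(Expanded, 30) == 2); // O(log n), slow comparator
  assert(Index.at(30) == 2);                     // O(1), no comparator
}
```

The point is that the hash lookup never touches the expensive comparator at 
query time; the one-time traversal of the expanded stream is exactly the 
up-front cost the earlier comments worry about for large, macro-heavy files.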


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99086/new/

https://reviews.llvm.org/D99086
