sammccall added inline comments.
================
Comment at: clang/include/clang/Tooling/Syntax/Tokens.h:372
+  // useful while finding expanded tokens in a 'token range'.
+  llvm::DenseMap<SourceLocation, unsigned> ExpandedTokIndex;
   llvm::DenseMap<FileID, MarkedFile> Files;
----------------
sammccall wrote:
> kadircet wrote:
> > usaxena95 wrote:
> > > kadircet wrote:
> > > > This definitely makes sense, but most of the users of `TokenBuffer` are currently invoking `expandedTokens(Range)` only once, so I am not sure if this is a huge benefit on their side, whereas they'll likely end up seeing some short-lived memory consumption (and an extra traversal of all the expanded tokens).
> > > >
> > > > Any reasons for not moving this logic to the application? The `expandedTokens()` endpoint will return all the tokens, no questions asked; the application can then build whatever acceleration structure it wants. If you think there are in-tree use cases that can benefit from this caching, maybe even provide that as a separate helper?
> > >
> > > > this definitely makes sense, but most of the users of TokenBuffers are currently invoking expandedTokens(Range) only once
> > >
> > > I didn't get that users are invoking it only once. This is not caching the results of `expandedTokens(Range)`; we are precomputing these to completely avoid a binary search that uses `isBeforeInTranslationUnit` (which is considerably slower). Doing an extra traversal of the expanded tokens is not noticeable compared to `isBeforeInTranslationUnit` latency. Users like SelectionTree (GoToDefinition, Hover, ...) can directly benefit from this even though they query this only once per some range.
> > >
> > > Umm, if you meant use cases like: build `TokenBuffer` -> use `expandedTokens(Range)` -> destroy `TokenBuffer`, I don't immediately see such use cases, but I think it is still better than using `isBeforeInTranslationUnit`. My mental model is that `TokenBuffer` is a long-lived object (for example, in `ParsedAST` in clangd); using it only once and then discarding it is expensive in itself.
> > >
> > > > Any reasons for not moving this logic to the application? If you think there are in-tree use cases that can benefit from this caching, maybe even provide that as a separate helper?
> > >
> > > This can basically benefit all users of `expandedTokens(Range)`. Since `expandedTokens(Range)` is provided by `TokenBuffer`, the optimization should also live in `TokenBuffer` rather than in the application. Users who only use this for non-token ranges will pay an additional cost, though; we can potentially make this precomputation optional for such cases.
> >
> > > I didn't get that users are invoking it only once.
> >
> > Sorry, that was confusing. I was trying to say that for use cases like Hover/Go-To etc., clangd builds the selection tree only once, and you are right that this might actually involve multiple calls to `expandedTokens(Range)`, but that shouldn't be the bottleneck of these features.
> >
> > > This is not caching the results of `expandedTokens(Range)`; we are precomputing these to completely avoid a binary search
> >
> > Right, I was saying that all that precomputation is also probably unneeded, as normally features are only interested in a single token, which should nest properly in the AST under normal circumstances and doesn't require calling `expandedTokens` for all possible ranges.
> >
> > > Doing an extra traversal of the expanded tokens is not noticeable compared to `isBeforeInTranslationUnit` latency. Users like SelectionTree (GoToDefinition, Hover, ...) can directly benefit from this even though they query this only once per some range.
> >
> > I was suggesting moving this into the application because that felt counter-intuitive: the benchmark you provided is actually running Hover (and other features) on every (spelled) token of the file. If you say that it is beneficial even for the normal Hover case, sure, let's go (but I'd be happier if you proved me wrong first 😅).
> >
> > > Umm, if you meant use cases like: build `TokenBuffer` -> use `expandedTokens(Range)` -> destroy `TokenBuffer`, I don't immediately see such use cases, but I think it is still better than using `isBeforeInTranslationUnit`. My mental model is that `TokenBuffer` is a long-lived object (for example, in `ParsedAST` in clangd); using it only once and then discarding it is expensive in itself.
> >
> > That's what clangd does, though (not sure if `TokenBuffer` has any other users ATM): it literally builds a new `ParsedAST` after every edit (it also has a cache to skip some work if the contents didn't change between requests). But usually all of those requests are user-initiated, so an extra latency of a couple of ms on every request isn't really noticeable. Getting rid of that latency would still be great, but not if it means slowing down the initial request for quite some time (due to the extra traversal of all the expanded tokens, which might be a huge deal especially on files including some tablegen'd stuff, as that is usually part of the main file and not the preamble).
> >
> > > This can basically benefit all users of `expandedTokens(Range)`.
> >
> > As stated above, I am just worried that this might hurt the use cases in which the application just builds a `TokenBuffer` and calls this on a few ranges. (Maybe clangd is the odd one out, but I don't see any other users; there's `clang::syntax::computeReplacements`, but it is only used in a test.)
> >
> > > Since `expandedTokens(Range)` is provided by `TokenBuffer`, the optimization should also live in `TokenBuffer` rather than in the application.
> >
> > I don't follow the reasoning here exactly; an application that needs to operate on every token of the expanded file can call the wrapper rather than using `TokenBuffer` directly. But the most dominant application in clangd's case is SelectionTree, and it is going to perform this computation all the time (because it wants to support both the batch and the single-invocation use case :/). As you suggested, this could be made optional (or triggered with an explicit endpoint), but that's gonna be ugly, as it needs to be supplied when building a `ParsedAST`, ugh..
> >
> > (Sorry for drawing a little circle here.) I suppose it is cleanest the way you propose it, then (unless your application is actually building `ParsedAST`s itself without any clangd abstraction like TUScheduler/ClangdServer). It would still be great if you could check for possible regressions on the common path. The easiest would be to just look at `ParsedAST::build` latencies on a couple of files; I suppose `clang/lib/Serialization/ASTReader.cpp` is a good example for the pathological case, and winning there would probably mean winning in all the other cases too, but it would be good to see a more "sane" file as well,
> > as anything after building the `ParsedAST` is guaranteed to be as fast as before.
> >
> > > Users who only use this for non-token ranges will pay an additional cost, though; we can potentially make this precomputation optional for such cases.
> >
> > I suppose that to hit the slow case, users would have to provide spelled tokens that did not expand to anything, which should be rare, so I don't think that'll be a huge problem.
>
> Long thread here :-)
>
> Kadir: one wrinkle that's maybe missed here is that clangd's SelectionTree calls this method *many* times to establish the bounds of many AST nodes. I fully believe this is dominating SelectionTree build time (though it would be nice to see benchmarks).
>
> > the benchmark you provided is actually running Hover (and other features) on every (spelled) token of the file
>
> I finally understood this. You're right, but we want to reuse some of the clangd code in batch mode (for code review workflows), which means exactly this!
>
> > If you say that it is beneficial even for the normal Hover case
>
> I'd love to have numbers, but my understanding is: parse >> selectiontree > hover. So this should be a >50% speedup for Hover **in the cached AST case**. That case is fast anyway, though, so it's not critically important.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99086/new/

https://reviews.llvm.org/D99086
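For readers skimming the archive, here is a rough standalone sketch of the precomputation being debated above: a one-time pass builds a map from each expanded token's location to its index, so a range query becomes two hash lookups instead of a binary search whose comparator (`isBeforeInTranslationUnit`) is itself expensive. Only the name `ExpandedTokIndex` comes from the diff context above; `TokenIndex`, `Token`, and `Loc` are illustrative stand-ins (a plain `std::unordered_map` and an integer instead of `llvm::DenseMap` and `clang::SourceLocation`), not the patch's actual code.

  #include <cstdint>
  #include <optional>
  #include <unordered_map>
  #include <utility>
  #include <vector>

  // Stand-in for clang::SourceLocation: an opaque id whose ordering across
  // macro expansions is expensive to compute in real clang
  // (SourceManager::isBeforeInTranslationUnit).
  using Loc = uint32_t;

  struct Token {
    Loc Location; // kind, length, etc. elided
  };

  class TokenIndex {
  public:
    explicit TokenIndex(std::vector<Token> Expanded)
        : ExpandedTokens(std::move(Expanded)) {
      // The one-time pass (and memory) debated in the thread: every client
      // pays this at build time, whether or not it ever queries a range.
      ExpandedTokIndex.reserve(ExpandedTokens.size());
      for (unsigned I = 0; I < ExpandedTokens.size(); ++I)
        ExpandedTokIndex.try_emplace(ExpandedTokens[I].Location, I);
    }

    // Maps the token locations [First, Last] to the half-open index range
    // [Begin, End) of expanded tokens, or nullopt if an endpoint is not an
    // expanded token's location (real code would fall back to the slow
    // binary search in that case).
    std::optional<std::pair<unsigned, unsigned>>
    expandedTokens(Loc First, Loc Last) const {
      auto B = ExpandedTokIndex.find(First);
      auto E = ExpandedTokIndex.find(Last);
      if (B == ExpandedTokIndex.end() || E == ExpandedTokIndex.end())
        return std::nullopt;
      return std::make_pair(B->second, E->second + 1);
    }

  private:
    std::vector<Token> ExpandedTokens;
    // Location -> index: two O(1) lookups per query instead of O(log N)
    // calls to an expensive comparator.
    std::unordered_map<Loc, unsigned> ExpandedTokIndex;
  };

The trade-off discussed in the thread is visible in the sketch: the indexing pass and the map's memory are paid at build time by every client, while only clients that query many ranges (e.g. SelectionTree establishing bounds for every AST node) amortize that cost.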