sammccall accepted this revision. sammccall added a comment. This revision is now accepted and ready to land.
Great stuff! Just nits (though a few of them; this is a tricky patch!). ================ Comment at: clangd/index/Background.cpp:122 +inline BackgroundIndex::FileDigest digest(StringRef Content) { + return SHA1::hash({(const uint8_t *)Content.data(), Content.size()}); ---------------- static rather than inline? ================ Comment at: clangd/index/Background.cpp:135 + +// Resolves URI to file paths with cache. +StringRef BackgroundIndex::uriToPath(StringRef FileURI, StringRef HintPath) { ---------------- Might be cleanest to wrap the cache, the URISchemes&, the hint path, and this function into a tiny class ================ Comment at: clangd/index/Background.cpp:160 + const StringMap<FileDigest> &FilesToUpdate) { + // Partition symbols/references into files. + struct File { ---------------- Maybe add a comment "eventually SymbolCollector may do this for us to avoid all this copying"? ================ Comment at: clangd/index/Background.cpp:311 SPAN_ATTACH(Tracer, "refs", int(Refs.numRefs())); - // FIXME: partition the symbols by file rather than TU, to avoid duplication. - IndexedSymbols.update(AbsolutePath, - llvm::make_unique<SymbolSlab>(std::move(Symbols)), - llvm::make_unique<RefSlab>(std::move(Refs))); - FileHash[AbsolutePath] = Hash; + update(AbsolutePath, std::move(Symbols), std::move(Refs), FilesToUpdate); + IndexedFileDigests[AbsolutePath] = Hash; ---------------- It looks like you never write the FilesToUpdate hashes back to the IndexedFileDigests, so we'll always update headers? ================ Comment at: clangd/index/Background.h:55 + using FileDigest = decltype(llvm::SHA1::hash({})); + using FileDigests = llvm::StringMap<FileDigest>; ---------------- private. ================ Comment at: clangd/index/Background.h:75 + FileDigests IndexedFileDigests; // Key is absolute file path. + llvm::StringMap<std::string> URIToPathCache; ---------------- There's no need for this to be global to the class, it's going to cause some hassle once there are multiple worker threads. I think can a cache can be scoped to a single call to update() (and both it and uriToPath() can be hidden in the cpp file.) ================ Comment at: clangd/index/FileIndex.cpp:123 + case DuplicateHandling::Merge: { + // Merge slabs into a single slab and only keep alive the merged. + DenseMap<SymbolID, Symbol> Merged; ---------------- comment is stale ================ Comment at: clangd/index/FileIndex.cpp:140 + } + case DuplicateHandling::PickOne: { + for (const auto &Slab : SymbolSlabs) ---------------- since we're deduplicating above, we could deduplicate here too and remove the deduplicating logic from Dex (MemIndex gets it by mistake). That would avoid duplicated deduplication (!) in the Merge + Dex case, and seems a little cleaner. Not something for this patch, but maybe add a FIXME. ================ Comment at: clangd/index/SymbolCollector.cpp:212 + auto I = FilesToIndexCache->try_emplace(FID); + if (!I.second) + return I.first->second; ---------------- nit: simplify logic by inverting the if here and sharing the return? ================ Comment at: clangd/index/SymbolCollector.cpp:441 std::string FileURI; - if (auto DeclLoc = getTokenLocation(MI->getDefinitionLoc(), SM, Opts, - PP->getLangOpts(), FileURI)) + auto DefLoc = MI->getDefinitionLoc(); + // FIXME: use the result to filter out symbols. ---------------- why is this hoisted above shouldIndexFile? or did you mean to use DefLoc below? ================ Comment at: clangd/index/SymbolCollector.cpp:442 + auto DefLoc = MI->getDefinitionLoc(); + // FIXME: use the result to filter out symbols. + shouldIndexFile(SM, SM.getFileID(Loc), Opts, &FilesToIndexCache); ---------------- just to check - this is basically trivial to do now but would need tests etc? (fine to defer from this patch) ================ Comment at: clangd/index/SymbolCollector.h:78 + /// `FileFilter(SM, FID)` is true. If not set, all files are indexed. + std::function<bool(const SourceManager &, FileID)> FileFilter = nullptr; }; ---------------- I think we should explicitly test this in SymbolCollectorTests - I think it should be straightforward? ================ Comment at: unittests/clangd/BackgroundIndexTests.cpp:28 // a.h yields different symbols when included by A.cc vs B.cc. // Currently we store symbols for each TU, so we get both. + FS.Files[testPath("root/A.h")] = R"( ---------------- stale comment? ================ Comment at: unittests/clangd/BackgroundIndexTests.cpp:29 // Currently we store symbols for each TU, so we get both. - FS.Files[testPath("root/A.h")] = "void a_h(); void NAME(){}"; - FS.Files[testPath("root/A.cc")] = "#include \"A.h\""; - FS.Files[testPath("root/B.cc")] = "#define NAME bar\n#include \"A.h\""; - BackgroundIndex Idx(Context::empty(), "", FS); + FS.Files[testPath("root/A.h")] = R"( + void common(); ---------------- nit: R"cpp( ================ Comment at: unittests/clangd/BackgroundIndexTests.cpp:33 + #if A + class A_H {}; + #else ---------------- naming here seems confusing: A_H seems to refer to the header, but then what does B_H refer to? ================ Comment at: unittests/clangd/BackgroundIndexTests.cpp:40 + "#include \"A.h\"\nvoid g() { (void)common; }"; + FS.Files[testPath("root/B.cc")] = + "#define A 0\n#include \"A.h\"\nvoid f_b() { (void)common; }"; ---------------- these should be raw strings now I think (sorry, the old test was already marginal) Repository: rCTE Clang Tools Extra https://reviews.llvm.org/D53433 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits