ioeric added inline comments.
================ Comment at: clang-tools-extra/clangd/index/dex/Trigram.cpp:31 +/// use it as the special symbol. +const auto END_MARKER = '$'; + ---------------- nit: s/auto/char/ Maybe just use `static` instead of an anonymous namespace just for this. ================ Comment at: clang-tools-extra/clangd/index/dex/Trigram.cpp:83 + + bool FoundFirstHead = false; + // Iterate through valid seqneces of three characters Fuzzy Matcher can ---------------- It's probably unclear what `FoundFirstHead` and `FoundSecondHead` are for to readers without context (i.e. we are looking first two HEADs). I think it's would be cleaner if we handle this out side of the look e.g. record the first head in the `Next` initialization loop above. ================ Comment at: clang-tools-extra/clangd/index/dex/Trigram.cpp:138 DenseSet<Token> UniqueTrigrams; std::deque<char> Chars; ---------------- nit: `Chars` is only used in the else branch? ================ Comment at: clang-tools-extra/clangd/index/dex/Trigram.cpp:143 + if (ValidSymbolsCount < 3) { + std::string Symbols = {{END_MARKER, END_MARKER, END_MARKER}}; + if (LowercaseQuery.size() > 0) ---------------- I think this can be simplified as: ``` std::string Chars = LowercaseQuery.substr(std::min(3, LowercaseQuery.size())); Chars.append(END_MARKER, 3-Chars.size()); UniqueTrigrams.insert(Token(Token::Kind::Trigram, Chars)); ``` also nit: avoid using the term "symbol" here. ================ Comment at: clang-tools-extra/clangd/index/dex/Trigram.cpp:167 - Chars.push_back(LowercaseQuery[I]); + if (Chars.front() == END_MARKER) + auto Trigram = Token(Token::Kind::Trigram, ---------------- When would this happen? And why are we reversing the trigram and throwing it away? ================ Comment at: clang-tools-extra/clangd/index/dex/Trigram.h:59 +/// * Bigram with the first two symbols (if availible), same logic applies. +/// /// Note: the returned list of trigrams does not have duplicates, if any trigram ---------------- ioeric wrote: > I think the comment can be simplified a bit: > ``` > This also generates incomplete trigrams for short query scenarios: > * Empty trigram: "$$$" > * Unigram: the first character of the identifier. > * Bigrams: a 2-char prefix of the identifier and a bigram of the first two > HEAD characters (if it exists). > ``` nit: and the previous paragraph `Special kind of trigrams ...` is redundant now IMO. https://reviews.llvm.org/D50517 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits