[PATCH] D50337: [clangd] DexIndex implementation prototype

2018-08-06 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev created this revision.
kbobyrev added reviewers: ioeric, ilya-biryukov.
kbobyrev added a project: clang-tools-extra.
Herald added subscribers: arphaman, mgrang, jkorous, MaskRay, mgorny.

This patch is a proof-of-concept Dex index implementation. It has several 
flaws that prevent it from replacing the static MemIndex yet:

- Short queries (fewer than 3 characters) are not handled; one way to solve 
this is to generate incomplete trigrams (unigrams and bigrams) and store them 
in the index structure.
- Speed measurements: manually editing files in Vim and requesting 
autocompletion gives the impression that performance is at least comparable 
to the current static index, but actual numbers are important because we 
don't want to roll out slow code and hurt users. Eric (@ioeric) suggested 
that we should only replace MemIndex once we have evidence that doing so is 
not a performance regression. An approach that is likely to be successful 
here is to wait until a benchmark library is available in the LLVM core 
repository, which is something I have suggested on the LLVM mailing lists, 
received positive feedback on and started working on. I will add a dependency 
as soon as the suggested patch is out for review (currently there is at least 
one complication, which is being addressed by 
https://github.com/google/benchmark/pull/649). Key performance improvements for 
iterators are sorting by cost and the limit iterator.
- Quality measurements: the boosting iterator and the two-phase lookup stage 
are not implemented yet; without them the quality is likely to be worse than 
what the current implementation yields. Measuring quality is tricky, but 
another suggestion in the offline discussion was that the drop-in replacement 
should only happen after the boosting iterator is implemented (along with the 
subsequent query enhancement).

The proposed changes do not affect Clangd functionality or performance: 
`DexIndex` is only used in unit tests and not in production code.
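
The "sorting by cost" improvement mentioned in the list above can be 
illustrated with a minimal sketch. It assumes posting lists are sorted vectors 
of document IDs and that list size is a usable cost estimate; `intersectByCost` 
and the `DocID` alias are hypothetical names for illustration, not the Dex 
iterator API.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <utility>
#include <vector>

using DocID = std::uint32_t; // Hypothetical alias, for illustration only.

// Intersects sorted posting lists, starting from the shortest one so that
// every intermediate result stays as small as possible.
std::vector<DocID> intersectByCost(std::vector<std::vector<DocID>> Lists) {
  if (Lists.empty())
    return {};
  // Cheapest lists first: the list size serves as the cost estimate.
  std::sort(Lists.begin(), Lists.end(),
            [](const std::vector<DocID> &L, const std::vector<DocID> &R) {
              return L.size() < R.size();
            });
  std::vector<DocID> Result = Lists.front();
  // Stop early once the intersection becomes empty.
  for (std::size_t I = 1; I < Lists.size() && !Result.empty(); ++I) {
    std::vector<DocID> Next;
    std::set_intersection(Result.begin(), Result.end(), Lists[I].begin(),
                          Lists[I].end(), std::back_inserter(Next));
    Result = std::move(Next);
  }
  return Result;
}
```

Starting from the shortest list bounds the size of every intermediate result, 
which is the same reason an AND iterator benefits from visiting its cheapest 
child first.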


https://reviews.llvm.org/D50337

Files:
  clang-tools-extra/clangd/CMakeLists.txt
  clang-tools-extra/clangd/index/MemIndex.h
  clang-tools-extra/clangd/index/dex/DexIndex.cpp
  clang-tools-extra/clangd/index/dex/DexIndex.h
  clang-tools-extra/clangd/index/dex/Token.cpp
  clang-tools-extra/clangd/index/dex/Token.h
  clang-tools-extra/unittests/clangd/CMakeLists.txt
  clang-tools-extra/unittests/clangd/DexIndexTests.cpp
  clang-tools-extra/unittests/clangd/IndexHelpers.cpp
  clang-tools-extra/unittests/clangd/IndexHelpers.h
  clang-tools-extra/unittests/clangd/IndexTests.cpp

Index: clang-tools-extra/unittests/clangd/IndexTests.cpp
===
--- clang-tools-extra/unittests/clangd/IndexTests.cpp
+++ clang-tools-extra/unittests/clangd/IndexTests.cpp
@@ -7,33 +7,20 @@
 //
 //===--===//
 
+#include "IndexHelpers.h"
 #include "index/Index.h"
 #include "index/MemIndex.h"
 #include "index/Merge.h"
 #include "gmock/gmock.h"
 #include "gtest/gtest.h"
 
-using testing::UnorderedElementsAre;
 using testing::Pointee;
+using testing::UnorderedElementsAre;
 
 namespace clang {
 namespace clangd {
 namespace {
 
-Symbol symbol(llvm::StringRef QName) {
-  Symbol Sym;
-  Sym.ID = SymbolID(QName.str());
-  size_t Pos = QName.rfind("::");
-  if (Pos == llvm::StringRef::npos) {
-    Sym.Name = QName;
-    Sym.Scope = "";
-  } else {
-    Sym.Name = QName.substr(Pos + 2);
-    Sym.Scope = QName.substr(0, Pos + 2);
-  }
-  return Sym;
-}
-
 MATCHER_P(Named, N, "") { return arg.Name == N; }
 
 TEST(SymbolSlab, FindAndIterate) {
@@ -52,59 +39,6 @@
 EXPECT_THAT(*S.find(SymbolID(Sym)), Named(Sym));
 }
 
-struct SlabAndPointers {
-  SymbolSlab Slab;
-  std::vector<const Symbol *> Pointers;
-};
-
-// Create a slab of symbols with the given qualified names as both IDs and
-// names. The life time of the slab is managed by the returned shared pointer.
-// If \p WeakSymbols is provided, it will be pointed to the managed object in
-// the returned shared pointer.
-std::shared_ptr<std::vector<const Symbol *>>
-generateSymbols(std::vector<std::string> QualifiedNames,
-                std::weak_ptr<SlabAndPointers> *WeakSymbols = nullptr) {
-  SymbolSlab::Builder Slab;
-  for (llvm::StringRef QName : QualifiedNames)
-    Slab.insert(symbol(QName));
-
-  auto Storage = std::make_shared<SlabAndPointers>();
-  Storage->Slab = std::move(Slab).build();
-  for (const auto &Sym : Storage->Slab)
-    Storage->Pointers.push_back(&Sym);
-  if (WeakSymbols)
-    *WeakSymbols = Storage;
-  auto *Pointers = &Storage->Pointers;
-  return {std::move(Storage), Pointers};
-}
-
-// Create a slab of symbols with IDs and names [Begin, End], otherwise identical
-// to the `generateSymbols` above.
-std::shared_ptr<std::vector<const Symbol *>>
-generateNumSymbols(int Begin, int End,
-                   std::weak_ptr<SlabAndPointers> *WeakSymbols = nullptr) {
-  std::vector<std::string> Names;
-  for (int i = Begin; i <= End; i++)
-    Names.push_back(std::to_string(i));
-  return ge

[PATCH] D50337: [clangd] DexIndex implementation prototype

2018-08-06 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev planned changes to this revision.
kbobyrev added a comment.

The patch is currently in preview mode; I have to make a few changes:

- Improve the testing infrastructure; one possible way would be to reuse 
exactly the tests `MemIndex` currently has, since `DexIndex` is meant to be a 
drop-in replacement. An existing obstacle is that queries shorter than 3 
characters are not handled, but that is not hard to fix.
- Document the `DexIndex` implementation and think about how to abstract out 
the very similar code shared with `MemIndex`. The proposed implementation is 
rather straightforward, but a few pieces are identical to `MemIndex`, which 
causes some code duplication.


https://reviews.llvm.org/D50337



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D50337: [clangd] DexIndex implementation prototype

2018-08-07 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev updated this revision to Diff 159463.
kbobyrev added a comment.

Don't resize the retrieved symbols vector; simply let the callback process at 
most `MaxCandidateCount` items.
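
The change described above can be sketched roughly as follows. This is an 
illustration under stated assumptions rather than the code from the diff: 
`reportMatches` is a hypothetical helper, and symbols are represented as plain 
strings.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical helper: invokes Callback on at most MaxCandidateCount matches
// and reports whether some matches were left unprocessed. The Matches vector
// itself is never resized.
bool reportMatches(const std::vector<std::string> &Matches,
                   std::size_t MaxCandidateCount,
                   const std::function<void(const std::string &)> &Callback) {
  std::size_t Reported = 0;
  for (const std::string &Match : Matches) {
    if (Reported == MaxCandidateCount)
      return true; // More matches exist than the caller asked for.
    Callback(Match);
    ++Reported;
  }
  return false;
}
```

Returning a flag when matches are dropped mirrors the way `fuzzyFind` reports 
an incomplete result in the test helpers of this patch.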


https://reviews.llvm.org/D50337

Files:
  clang-tools-extra/clangd/CMakeLists.txt
  clang-tools-extra/clangd/index/MemIndex.h
  clang-tools-extra/clangd/index/dex/DexIndex.cpp
  clang-tools-extra/clangd/index/dex/DexIndex.h
  clang-tools-extra/clangd/index/dex/Token.cpp
  clang-tools-extra/clangd/index/dex/Token.h
  clang-tools-extra/unittests/clangd/CMakeLists.txt
  clang-tools-extra/unittests/clangd/DexIndexTests.cpp
  clang-tools-extra/unittests/clangd/IndexHelpers.cpp
  clang-tools-extra/unittests/clangd/IndexHelpers.h
  clang-tools-extra/unittests/clangd/IndexTests.cpp

Index: clang-tools-extra/unittests/clangd/IndexTests.cpp
===
--- clang-tools-extra/unittests/clangd/IndexTests.cpp
+++ clang-tools-extra/unittests/clangd/IndexTests.cpp
@@ -7,33 +7,20 @@
 //
 //===--===//
 
+#include "IndexHelpers.h"
 #include "index/Index.h"
 #include "index/MemIndex.h"
 #include "index/Merge.h"
 #include "gmock/gmock.h"
 #include "gtest/gtest.h"
 
-using testing::UnorderedElementsAre;
 using testing::Pointee;
+using testing::UnorderedElementsAre;
 
 namespace clang {
 namespace clangd {
 namespace {
 
-Symbol symbol(llvm::StringRef QName) {
-  Symbol Sym;
-  Sym.ID = SymbolID(QName.str());
-  size_t Pos = QName.rfind("::");
-  if (Pos == llvm::StringRef::npos) {
-    Sym.Name = QName;
-    Sym.Scope = "";
-  } else {
-    Sym.Name = QName.substr(Pos + 2);
-    Sym.Scope = QName.substr(0, Pos + 2);
-  }
-  return Sym;
-}
-
 MATCHER_P(Named, N, "") { return arg.Name == N; }
 
 TEST(SymbolSlab, FindAndIterate) {
@@ -52,59 +39,6 @@
 EXPECT_THAT(*S.find(SymbolID(Sym)), Named(Sym));
 }
 
-struct SlabAndPointers {
-  SymbolSlab Slab;
-  std::vector<const Symbol *> Pointers;
-};
-
-// Create a slab of symbols with the given qualified names as both IDs and
-// names. The life time of the slab is managed by the returned shared pointer.
-// If \p WeakSymbols is provided, it will be pointed to the managed object in
-// the returned shared pointer.
-std::shared_ptr<std::vector<const Symbol *>>
-generateSymbols(std::vector<std::string> QualifiedNames,
-                std::weak_ptr<SlabAndPointers> *WeakSymbols = nullptr) {
-  SymbolSlab::Builder Slab;
-  for (llvm::StringRef QName : QualifiedNames)
-    Slab.insert(symbol(QName));
-
-  auto Storage = std::make_shared<SlabAndPointers>();
-  Storage->Slab = std::move(Slab).build();
-  for (const auto &Sym : Storage->Slab)
-    Storage->Pointers.push_back(&Sym);
-  if (WeakSymbols)
-    *WeakSymbols = Storage;
-  auto *Pointers = &Storage->Pointers;
-  return {std::move(Storage), Pointers};
-}
-
-// Create a slab of symbols with IDs and names [Begin, End], otherwise identical
-// to the `generateSymbols` above.
-std::shared_ptr<std::vector<const Symbol *>>
-generateNumSymbols(int Begin, int End,
-                   std::weak_ptr<SlabAndPointers> *WeakSymbols = nullptr) {
-  std::vector<std::string> Names;
-  for (int i = Begin; i <= End; i++)
-    Names.push_back(std::to_string(i));
-  return generateSymbols(Names, WeakSymbols);
-}
-
-std::string getQualifiedName(const Symbol &Sym) {
-  return (Sym.Scope + Sym.Name).str();
-}
-
-std::vector<std::string> match(const SymbolIndex &I,
-                               const FuzzyFindRequest &Req,
-                               bool *Incomplete = nullptr) {
-  std::vector<std::string> Matches;
-  bool IsIncomplete = I.fuzzyFind(Req, [&](const Symbol &Sym) {
-    Matches.push_back(getQualifiedName(Sym));
-  });
-  if (Incomplete)
-    *Incomplete = IsIncomplete;
-  return Matches;
-}
-
 TEST(MemIndexTest, MemIndexSymbolsRecycled) {
   MemIndex I;
   std::weak_ptr<SlabAndPointers> Symbols;
@@ -212,18 +146,6 @@
   EXPECT_THAT(match(I, Req), UnorderedElementsAre("ns::ABC", "ns::abc"));
 }
 
-// Returns qualified names of symbols with any of IDs in the index.
-std::vector<std::string> lookup(const SymbolIndex &I,
-                                llvm::ArrayRef<SymbolID> IDs) {
-  LookupRequest Req;
-  Req.IDs.insert(IDs.begin(), IDs.end());
-  std::vector<std::string> Results;
-  I.lookup(Req, [&](const Symbol &Sym) {
-    Results.push_back(getQualifiedName(Sym));
-  });
-  return Results;
-}
-
 TEST(MemIndexTest, Lookup) {
   MemIndex I;
   I.build(generateSymbols({"ns::abc", "ns::xyz"}));
@@ -269,7 +191,7 @@
 TEST(MergeTest, Merge) {
   Symbol L, R;
   L.ID = R.ID = SymbolID("hello");
-  L.Name = R.Name = "Foo";// same in both
+  L.Name = R.Name = "Foo";   // same in both
   L.CanonicalDeclaration.FileURI = "file:///left.h"; // differs
   R.CanonicalDeclaration.FileURI = "file:///right.h";
   L.References = 1;
Index: clang-tools-extra/unittests/clangd/IndexHelpers.h
===
--- /dev/null
+++ clang-tools-extra/unittests/clangd/IndexHelpers.h
@@ -0,0 +1,57 @@
+//===-- IndexHelpers.h --*- C++ -*-===//
+//
+// 

[PATCH] D50337: [clangd] DexIndex implementation prototype

2018-08-07 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev added a comment.

As discussed offline, generating incomplete trigrams (unigrams and bigrams) 
should be a blocker for this patch, because otherwise it isn't functional. Once 
incomplete trigrams are in, `MemIndex` tests can be reused for `DexIndex` to 
ensure stability.
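
To make the incomplete-trigram idea concrete, here is a minimal sketch. It 
assumes the identifier is a single segment (no `_` or camelCase boundaries, so 
only contiguous characters are combined) and that `$` is the padding character, 
as in the tests of https://reviews.llvm.org/D50517. `contiguousTrigrams` and 
`EndMarker` are hypothetical names, not the functions from the patch.

```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <string>

// Padding character for unigrams and bigrams (assumption: '$', which cannot
// appear in a valid identifier).
const char EndMarker = '$';

// Simplified sketch: returns the lowercase trigrams of contiguous characters
// plus padded bigrams and unigrams. Segment-skipping trigrams are omitted.
std::set<std::string> contiguousTrigrams(const std::string &Identifier) {
  std::string Lower;
  for (unsigned char C : Identifier)
    Lower.push_back(static_cast<char>(std::tolower(C)));

  std::set<std::string> Result;
  for (std::size_t I = 0; I < Lower.size(); ++I) {
    // Take 1, 2 and 3 characters starting at I and pad the short ones.
    for (std::size_t Len = 1; Len <= 3 && I + Len <= Lower.size(); ++Len) {
      std::string Token = Lower.substr(I, Len);
      Token.append(3 - Len, EndMarker);
      Result.insert(Token);
    }
  }
  return Result;
}
```

With this scheme a two-character query such as `nl` still produces tokens 
(`nl$`, `n$$`, `l$$`) that can be looked up in the index, which is what makes 
short queries answerable.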


https://reviews.llvm.org/D50337





[PATCH] D50337: [clangd] DexIndex implementation prototype

2018-08-07 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev updated this revision to Diff 159515.
kbobyrev added a comment.

Continue implementing the proof-of-concept Dex-based static index replacement.

This diff adds short query processing. The current solution does not use the 
iterator framework yet (unlike general queries) and is subject to change. As 
discussed offline, this implementation should lean towards simplicity and 
usability rather than premature optimization.

The patch is not ready for a comprehensive review yet; these are a few points 
that should be addressed before the review is live:

- Code duplication should be reduced as much as possible. `DexIndex` is likely 
to become much more sophisticated than `MemIndex` in the future, so it does 
not simply inherit from or reuse `MemIndex`. This is also why (as discussed 
offline) code duplication in unit tests is not that bad, keeping in mind that 
the functionality and implementation of the two index types will diverge in 
the future. However, it is better to abstract out as much as possible, as long 
as the implementation does not become less flexible and no cross-dependencies 
are introduced in the process.
- Slightly clean up the unit tests (`IndexHelpers.(h|cpp)` is not a very good 
name for the new file used by both the `MemIndex` and `DexIndex` tests; code 
duplication is also a slight concern).


https://reviews.llvm.org/D50337

Files:
  clang-tools-extra/clangd/CMakeLists.txt
  clang-tools-extra/clangd/index/MemIndex.h
  clang-tools-extra/clangd/index/dex/DexIndex.cpp
  clang-tools-extra/clangd/index/dex/DexIndex.h
  clang-tools-extra/clangd/index/dex/Token.cpp
  clang-tools-extra/clangd/index/dex/Token.h
  clang-tools-extra/unittests/clangd/CMakeLists.txt
  clang-tools-extra/unittests/clangd/DexIndexTests.cpp
  clang-tools-extra/unittests/clangd/IndexHelpers.cpp
  clang-tools-extra/unittests/clangd/IndexHelpers.h
  clang-tools-extra/unittests/clangd/IndexTests.cpp

Index: clang-tools-extra/unittests/clangd/IndexTests.cpp
===
--- clang-tools-extra/unittests/clangd/IndexTests.cpp
+++ clang-tools-extra/unittests/clangd/IndexTests.cpp
@@ -7,33 +7,20 @@
 //
 //===--===//
 
+#include "IndexHelpers.h"
 #include "index/Index.h"
 #include "index/MemIndex.h"
 #include "index/Merge.h"
 #include "gmock/gmock.h"
 #include "gtest/gtest.h"
 
-using testing::UnorderedElementsAre;
 using testing::Pointee;
+using testing::UnorderedElementsAre;
 
 namespace clang {
 namespace clangd {
 namespace {
 
-Symbol symbol(llvm::StringRef QName) {
-  Symbol Sym;
-  Sym.ID = SymbolID(QName.str());
-  size_t Pos = QName.rfind("::");
-  if (Pos == llvm::StringRef::npos) {
-    Sym.Name = QName;
-    Sym.Scope = "";
-  } else {
-    Sym.Name = QName.substr(Pos + 2);
-    Sym.Scope = QName.substr(0, Pos + 2);
-  }
-  return Sym;
-}
-
 MATCHER_P(Named, N, "") { return arg.Name == N; }
 
 TEST(SymbolSlab, FindAndIterate) {
@@ -52,59 +39,6 @@
 EXPECT_THAT(*S.find(SymbolID(Sym)), Named(Sym));
 }
 
-struct SlabAndPointers {
-  SymbolSlab Slab;
-  std::vector<const Symbol *> Pointers;
-};
-
-// Create a slab of symbols with the given qualified names as both IDs and
-// names. The life time of the slab is managed by the returned shared pointer.
-// If \p WeakSymbols is provided, it will be pointed to the managed object in
-// the returned shared pointer.
-std::shared_ptr<std::vector<const Symbol *>>
-generateSymbols(std::vector<std::string> QualifiedNames,
-                std::weak_ptr<SlabAndPointers> *WeakSymbols = nullptr) {
-  SymbolSlab::Builder Slab;
-  for (llvm::StringRef QName : QualifiedNames)
-    Slab.insert(symbol(QName));
-
-  auto Storage = std::make_shared<SlabAndPointers>();
-  Storage->Slab = std::move(Slab).build();
-  for (const auto &Sym : Storage->Slab)
-    Storage->Pointers.push_back(&Sym);
-  if (WeakSymbols)
-    *WeakSymbols = Storage;
-  auto *Pointers = &Storage->Pointers;
-  return {std::move(Storage), Pointers};
-}
-
-// Create a slab of symbols with IDs and names [Begin, End], otherwise identical
-// to the `generateSymbols` above.
-std::shared_ptr<std::vector<const Symbol *>>
-generateNumSymbols(int Begin, int End,
-                   std::weak_ptr<SlabAndPointers> *WeakSymbols = nullptr) {
-  std::vector<std::string> Names;
-  for (int i = Begin; i <= End; i++)
-    Names.push_back(std::to_string(i));
-  return generateSymbols(Names, WeakSymbols);
-}
-
-std::string getQualifiedName(const Symbol &Sym) {
-  return (Sym.Scope + Sym.Name).str();
-}
-
-std::vector<std::string> match(const SymbolIndex &I,
-                               const FuzzyFindRequest &Req,
-                               bool *Incomplete = nullptr) {
-  std::vector<std::string> Matches;
-  bool IsIncomplete = I.fuzzyFind(Req, [&](const Symbol &Sym) {
-    Matches.push_back(getQualifiedName(Sym));
-  });
-  if (Incomplete)
-    *Incomplete = IsIncomplete;
-  return Matches;
-}
-
 TEST(MemIndexTest, MemIndexSymbolsRecycled) {
   MemIndex I;
   std::weak_ptr<SlabAndPointers> Symbols;
@@ -212,18 +146,6 @@
   EXPECT_THAT(match(I, Req), UnorderedElementsAre("ns::A

[PATCH] D50337: [clangd] DexIndex implementation prototype

2018-08-08 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev updated this revision to Diff 159658.
kbobyrev added a comment.

Minor code cleanup. This is now a fully functional symbol index.

I have reflected my concerns and uncertainties in `FIXME`s; please indicate if 
you think there is something to improve in this patch. In general, I believe it 
is ready for review.


https://reviews.llvm.org/D50337

Files:
  clang-tools-extra/clangd/CMakeLists.txt
  clang-tools-extra/clangd/index/MemIndex.h
  clang-tools-extra/clangd/index/dex/DexIndex.cpp
  clang-tools-extra/clangd/index/dex/DexIndex.h
  clang-tools-extra/clangd/index/dex/Token.cpp
  clang-tools-extra/clangd/index/dex/Token.h
  clang-tools-extra/unittests/clangd/CMakeLists.txt
  clang-tools-extra/unittests/clangd/DexIndexTests.cpp
  clang-tools-extra/unittests/clangd/IndexTests.cpp
  clang-tools-extra/unittests/clangd/TestIndexOperations.cpp
  clang-tools-extra/unittests/clangd/TestIndexOperations.h

Index: clang-tools-extra/unittests/clangd/TestIndexOperations.h
===
--- /dev/null
+++ clang-tools-extra/unittests/clangd/TestIndexOperations.h
@@ -0,0 +1,57 @@
+//===-- IndexHelpers.h --*- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+
+#ifndef LLVM_CLANG_TOOLS_EXTRA_UNITTESTS_CLANGD_INDEXTESTCOMMON_H
+#define LLVM_CLANG_TOOLS_EXTRA_UNITTESTS_CLANGD_INDEXTESTCOMMON_H
+
+#include "index/Index.h"
+#include "index/Merge.h"
+#include "index/dex/DexIndex.h"
+#include "index/dex/Iterator.h"
+#include "index/dex/Token.h"
+#include "index/dex/Trigram.h"
+
+namespace clang {
+namespace clangd {
+
+Symbol symbol(llvm::StringRef QName);
+
+struct SlabAndPointers {
+  SymbolSlab Slab;
+  std::vector<const Symbol *> Pointers;
+};
+
+// Create a slab of symbols with the given qualified names as both IDs and
+// names. The life time of the slab is managed by the returned shared pointer.
+// If \p WeakSymbols is provided, it will be pointed to the managed object in
+// the returned shared pointer.
+std::shared_ptr<std::vector<const Symbol *>>
+generateSymbols(std::vector<std::string> QualifiedNames,
+                std::weak_ptr<SlabAndPointers> *WeakSymbols = nullptr);
+
+// Create a slab of symbols with IDs and names [Begin, End], otherwise identical
+// to the `generateSymbols` above.
+std::shared_ptr<std::vector<const Symbol *>>
+generateNumSymbols(int Begin, int End,
+                   std::weak_ptr<SlabAndPointers> *WeakSymbols = nullptr);
+
+std::string getQualifiedName(const Symbol &Sym);
+
+std::vector<std::string> match(const SymbolIndex &I,
+                               const FuzzyFindRequest &Req,
+                               bool *Incomplete = nullptr);
+
+// Returns qualified names of symbols with any of IDs in the index.
+std::vector<std::string> lookup(const SymbolIndex &I,
+                                llvm::ArrayRef<SymbolID> IDs);
+
+} // namespace clangd
+} // namespace clang
+
+#endif
Index: clang-tools-extra/unittests/clangd/TestIndexOperations.cpp
===
--- /dev/null
+++ clang-tools-extra/unittests/clangd/TestIndexOperations.cpp
@@ -0,0 +1,89 @@
+//===-- IndexHelpers.cpp *- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+
+#include "TestIndexOperations.h"
+
+namespace clang {
+namespace clangd {
+
+Symbol symbol(llvm::StringRef QName) {
+  Symbol Sym;
+  Sym.ID = SymbolID(QName.str());
+  size_t Pos = QName.rfind("::");
+  if (Pos == llvm::StringRef::npos) {
+    Sym.Name = QName;
+    Sym.Scope = "";
+  } else {
+    Sym.Name = QName.substr(Pos + 2);
+    Sym.Scope = QName.substr(0, Pos + 2);
+  }
+  return Sym;
+}
+
+// Create a slab of symbols with the given qualified names as both IDs and
+// names. The life time of the slab is managed by the returned shared pointer.
+// If \p WeakSymbols is provided, it will be pointed to the managed object in
+// the returned shared pointer.
+std::shared_ptr<std::vector<const Symbol *>>
+generateSymbols(std::vector<std::string> QualifiedNames,
+                std::weak_ptr<SlabAndPointers> *WeakSymbols) {
+  SymbolSlab::Builder Slab;
+  for (llvm::StringRef QName : QualifiedNames)
+    Slab.insert(symbol(QName));
+
+  auto Storage = std::make_shared<SlabAndPointers>();
+  Storage->Slab = std::move(Slab).build();
+  for (const auto &Sym : Storage->Slab)
+    Storage->Pointers.push_back(&Sym);
+  if (WeakSymbols)
+    *WeakSymbols = Storage;
+  auto *Pointers = &Storage->Pointers;
+  return {std::move(Storage), Pointers};
+}
+
+// Create a slab of symbols with IDs and names [Begin, End], otherwise identical
+// to the `generateSymbols` above.
+std::shared_ptr<std::vector<const Symbol *>>
+generateNumSymbols(int Begin, int End,
+   

[PATCH] D50500: [clangd] Allow consuming limited number of items

2018-08-09 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev created this revision.
kbobyrev added reviewers: ioeric, ilya-biryukov.
kbobyrev added a project: clang-tools-extra.
Herald added subscribers: arphaman, jkorous, MaskRay.

This patch modifies the `consume` function to allow retrieving a limited number 
of symbols. This is a "cheap" implementation of a top-level limit iterator. In 
the future we would like to have a complete limit iterator implementation that 
can be inserted into query subtrees, but in the meantime this version is 
enough for a fully functional proof-of-concept Dex implementation.
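
A self-contained sketch of the limited `consume` behaviour described above is 
given here. `ToyIterator` is a hypothetical stand-in for the Dex `Iterator` 
interface (only `reachedEnd`, `advance` and `peek` are mirrored), and the 
`DocID` alias is an assumption made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <utility>
#include <vector>

using DocID = std::uint32_t; // Hypothetical alias, for illustration only.

// Hypothetical stand-in for an iterator over a sorted posting list.
class ToyIterator {
public:
  explicit ToyIterator(std::vector<DocID> Documents)
      : Documents(std::move(Documents)) {}
  bool reachedEnd() const { return Index == Documents.size(); }
  void advance() { ++Index; }
  DocID peek() const { return Documents[Index]; }

private:
  std::vector<DocID> Documents;
  std::size_t Index = 0;
};

// Advances the iterator until it is exhausted or Limit items were retrieved.
// The counter must be incremented together with the iterator; otherwise the
// limit is never reached.
std::vector<DocID>
consume(ToyIterator &It,
        std::size_t Limit = std::numeric_limits<std::size_t>::max()) {
  std::vector<DocID> Result;
  for (std::size_t Retrieved = 0; !It.reachedEnd() && Retrieved < Limit;
       It.advance(), ++Retrieved)
    Result.push_back(It.peek());
  return Result;
}
```

Because the iterator is taken by reference, a second call continues where the 
first one stopped, so a caller can fetch results in batches.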


https://reviews.llvm.org/D50500

Files:
  clang-tools-extra/clangd/index/dex/Iterator.cpp
  clang-tools-extra/clangd/index/dex/Iterator.h


Index: clang-tools-extra/clangd/index/dex/Iterator.h
===
--- clang-tools-extra/clangd/index/dex/Iterator.h
+++ clang-tools-extra/clangd/index/dex/Iterator.h
@@ -101,9 +101,12 @@
   virtual llvm::raw_ostream &dump(llvm::raw_ostream &OS) const = 0;
 };
 
-/// Exhausts given iterator and returns all processed DocIDs. The result
-/// contains sorted DocumentIDs.
-std::vector<DocID> consume(Iterator &It);
+/// Advances the iterator until it is either exhausted or the number of
+/// requested items is reached. The result contains sorted DocumentIDs. Size of
+/// the returned vector is min(Limit, IteratorSize) where IteratorSize stands
+/// for the number of elements obtained before the iterator is exhausted.
+std::vector<DocID> consume(Iterator &It,
+                           size_t Limit = std::numeric_limits<size_t>::max());
 
 /// Returns a document iterator over given PostingList.
 std::unique_ptr<Iterator> create(PostingListRef Documents);
Index: clang-tools-extra/clangd/index/dex/Iterator.cpp
===
--- clang-tools-extra/clangd/index/dex/Iterator.cpp
+++ clang-tools-extra/clangd/index/dex/Iterator.cpp
@@ -218,9 +218,10 @@
 
 } // end namespace
 
-std::vector<DocID> consume(Iterator &It) {
+std::vector<DocID> consume(Iterator &It, size_t Limit) {
   std::vector<DocID> Result;
-  for (; !It.reachedEnd(); It.advance())
+  for (size_t Retrieved = 0; !It.reachedEnd() && Retrieved < Limit;
+       It.advance(), ++Retrieved)
     Result.push_back(It.peek());
   return Result;
 }




[PATCH] D50337: [clangd] DexIndex implementation prototype

2018-08-09 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev updated this revision to Diff 159908.
kbobyrev marked 15 inline comments as done.
kbobyrev added a comment.

Address a round of comments. Also put `FIXME`s where appropriate for the future 
changes.


https://reviews.llvm.org/D50337

Files:
  clang-tools-extra/clangd/CMakeLists.txt
  clang-tools-extra/clangd/index/dex/DexIndex.cpp
  clang-tools-extra/clangd/index/dex/DexIndex.h
  clang-tools-extra/clangd/index/dex/Token.cpp
  clang-tools-extra/clangd/index/dex/Token.h
  clang-tools-extra/unittests/clangd/CMakeLists.txt
  clang-tools-extra/unittests/clangd/DexIndexTests.cpp
  clang-tools-extra/unittests/clangd/IndexTests.cpp
  clang-tools-extra/unittests/clangd/TestIndexOperations.cpp
  clang-tools-extra/unittests/clangd/TestIndexOperations.h

Index: clang-tools-extra/unittests/clangd/TestIndexOperations.h
===
--- /dev/null
+++ clang-tools-extra/unittests/clangd/TestIndexOperations.h
@@ -0,0 +1,57 @@
+//===-- IndexHelpers.h --*- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+
+#ifndef LLVM_CLANG_TOOLS_EXTRA_UNITTESTS_CLANGD_INDEXTESTCOMMON_H
+#define LLVM_CLANG_TOOLS_EXTRA_UNITTESTS_CLANGD_INDEXTESTCOMMON_H
+
+#include "index/Index.h"
+#include "index/Merge.h"
+#include "index/dex/DexIndex.h"
+#include "index/dex/Iterator.h"
+#include "index/dex/Token.h"
+#include "index/dex/Trigram.h"
+
+namespace clang {
+namespace clangd {
+
+Symbol symbol(llvm::StringRef QName);
+
+struct SlabAndPointers {
+  SymbolSlab Slab;
+  std::vector<const Symbol *> Pointers;
+};
+
+// Create a slab of symbols with the given qualified names as both IDs and
+// names. The life time of the slab is managed by the returned shared pointer.
+// If \p WeakSymbols is provided, it will be pointed to the managed object in
+// the returned shared pointer.
+std::shared_ptr<std::vector<const Symbol *>>
+generateSymbols(std::vector<std::string> QualifiedNames,
+                std::weak_ptr<SlabAndPointers> *WeakSymbols = nullptr);
+
+// Create a slab of symbols with IDs and names [Begin, End], otherwise identical
+// to the `generateSymbols` above.
+std::shared_ptr<std::vector<const Symbol *>>
+generateNumSymbols(int Begin, int End,
+                   std::weak_ptr<SlabAndPointers> *WeakSymbols = nullptr);
+
+std::string getQualifiedName(const Symbol &Sym);
+
+std::vector<std::string> match(const SymbolIndex &I,
+                               const FuzzyFindRequest &Req,
+                               bool *Incomplete = nullptr);
+
+// Returns qualified names of symbols with any of IDs in the index.
+std::vector<std::string> lookup(const SymbolIndex &I,
+                                llvm::ArrayRef<SymbolID> IDs);
+
+} // namespace clangd
+} // namespace clang
+
+#endif
Index: clang-tools-extra/unittests/clangd/TestIndexOperations.cpp
===
--- /dev/null
+++ clang-tools-extra/unittests/clangd/TestIndexOperations.cpp
@@ -0,0 +1,89 @@
+//===-- IndexHelpers.cpp *- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+
+#include "TestIndexOperations.h"
+
+namespace clang {
+namespace clangd {
+
+Symbol symbol(llvm::StringRef QName) {
+  Symbol Sym;
+  Sym.ID = SymbolID(QName.str());
+  size_t Pos = QName.rfind("::");
+  if (Pos == llvm::StringRef::npos) {
+    Sym.Name = QName;
+    Sym.Scope = "";
+  } else {
+    Sym.Name = QName.substr(Pos + 2);
+    Sym.Scope = QName.substr(0, Pos + 2);
+  }
+  return Sym;
+}
+
+// Create a slab of symbols with the given qualified names as both IDs and
+// names. The life time of the slab is managed by the returned shared pointer.
+// If \p WeakSymbols is provided, it will be pointed to the managed object in
+// the returned shared pointer.
+std::shared_ptr<std::vector<const Symbol *>>
+generateSymbols(std::vector<std::string> QualifiedNames,
+                std::weak_ptr<SlabAndPointers> *WeakSymbols) {
+  SymbolSlab::Builder Slab;
+  for (llvm::StringRef QName : QualifiedNames)
+    Slab.insert(symbol(QName));
+
+  auto Storage = std::make_shared<SlabAndPointers>();
+  Storage->Slab = std::move(Slab).build();
+  for (const auto &Sym : Storage->Slab)
+    Storage->Pointers.push_back(&Sym);
+  if (WeakSymbols)
+    *WeakSymbols = Storage;
+  auto *Pointers = &Storage->Pointers;
+  return {std::move(Storage), Pointers};
+}
+
+// Create a slab of symbols with IDs and names [Begin, End], otherwise identical
+// to the `generateSymbols` above.
+std::shared_ptr<std::vector<const Symbol *>>
+generateNumSymbols(int Begin, int End,
+                   std::weak_ptr<SlabAndPointers> *WeakSymbols) {
+  std::vector<std::string> Names;
+  for (int i = Begin; i <= End; i++)
+    Names.push_back(std::to_string(i));
+  return generateSym

[PATCH] D50517: [clangd] Generate incomplete trigrams for the Dex index

2018-08-09 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev created this revision.
kbobyrev added reviewers: ioeric, ilya-biryukov.
Herald added subscribers: arphaman, jkorous, MaskRay.

https://reviews.llvm.org/D50517

Files:
  clang-tools-extra/clangd/index/dex/Trigram.cpp
  clang-tools-extra/clangd/index/dex/Trigram.h
  clang-tools-extra/unittests/clangd/DexIndexTests.cpp

Index: clang-tools-extra/unittests/clangd/DexIndexTests.cpp
===
--- clang-tools-extra/unittests/clangd/DexIndexTests.cpp
+++ clang-tools-extra/unittests/clangd/DexIndexTests.cpp
@@ -250,30 +250,44 @@
 }
 
 TEST(DexIndexTrigrams, IdentifierTrigrams) {
-  EXPECT_THAT(generateIdentifierTrigrams("X86"), trigramsAre({"x86"}));
+  EXPECT_THAT(generateIdentifierTrigrams("X86"),
+  trigramsAre({"x86", "x$$", "8$$", "6$$", "x8$", "86$"}));
 
-  EXPECT_THAT(generateIdentifierTrigrams("nl"), trigramsAre({}));
+  EXPECT_THAT(generateIdentifierTrigrams("nl"),
+  trigramsAre({"nl$", "n$$", "l$$"}));
 
-  EXPECT_THAT(generateIdentifierTrigrams("clangd"),
-  trigramsAre({"cla", "lan", "ang", "ngd"}));
-
-  EXPECT_THAT(generateIdentifierTrigrams("abc_def"),
-  trigramsAre({"abc", "abd", "ade", "bcd", "bde", "cde", "def"}));
+  EXPECT_THAT(generateIdentifierTrigrams("n"), trigramsAre({"n$$"}));
 
   EXPECT_THAT(
-  generateIdentifierTrigrams("a_b_c_d_e_"),
-  trigramsAre({"abc", "abd", "acd", "ace", "bcd", "bce", "bde", "cde"}));
+  generateIdentifierTrigrams("clangd"),
+  trigramsAre({"cla", "lan", "ang", "ngd", "an$", "n$$", "g$$", "cl$",
+   "ng$", "d$$", "l$$", "a$$", "c$$", "gd$", "la$"}));
 
-  EXPECT_THAT(
-  generateIdentifierTrigrams("unique_ptr"),
-  trigramsAre({"uni", "unp", "upt", "niq", "nip", "npt", "iqu", "iqp",
-   "ipt", "que", "qup", "qpt", "uep", "ept", "ptr"}));
+  EXPECT_THAT(generateIdentifierTrigrams("abc_def"),
+  trigramsAre({"abc", "abd", "ade", "bcd", "bde", "cde", "def",
+   "a$$", "b$$", "c$$", "d$$", "e$$", "f$$", "ab$",
+   "ad$", "bc$", "bd$", "cd$", "de$", "ef$"}));
+
+  EXPECT_THAT(generateIdentifierTrigrams("a_b_c_d_e_"),
+  trigramsAre({"abc", "abd", "acd", "ace", "bcd", "bce", "bde",
+   "cde", "a$$", "ab$", "ac$", "b$$", "bc$", "bd$",
+   "c$$", "cd$", "ce$", "d$$", "de$", "e$$"}));
+
+  EXPECT_THAT(generateIdentifierTrigrams("unique_ptr"),
+  trigramsAre({"uni", "unp", "upt", "niq", "nip", "npt", "iqu",
+   "iqp", "ipt", "que", "qup", "qpt", "uep", "ept",
+   "ptr", "u$$", "un$", "up$", "n$$", "ni$", "np$",
+   "i$$", "iq$", "ip$", "q$$", "qu$", "qp$", "ue$",
+   "e$$", "ep$", "p$$", "pt$", "t$$", "tr$", "r$$"}));
 
   EXPECT_THAT(generateIdentifierTrigrams("TUDecl"),
-  trigramsAre({"tud", "tde", "ude", "dec", "ecl"}));
+  trigramsAre({"tud", "tde", "ude", "dec", "ecl", "t$$", "tu$",
+   "td$", "u$$", "ud$", "d$$", "de$", "e$$", "ec$",
+   "c$$", "cl$", "l$$"}));
 
   EXPECT_THAT(generateIdentifierTrigrams("IsOK"),
-  trigramsAre({"iso", "iok", "sok"}));
+  trigramsAre({"iso", "iok", "sok", "i$$", "is$", "io$", "s$$",
+   "so$", "o$$", "ok$", "k$$"}));
 
   EXPECT_THAT(generateIdentifierTrigrams("abc_defGhij__klm"),
   trigramsAre({
Index: clang-tools-extra/clangd/index/dex/Trigram.h
===
--- clang-tools-extra/clangd/index/dex/Trigram.h
+++ clang-tools-extra/clangd/index/dex/Trigram.h
@@ -31,6 +31,11 @@
 namespace clangd {
 namespace dex {
 
+/// This is used to mark unigrams and bigrams and to distinguish them from
+/// complete trigrams. Since '$' is not present in valid identifier names, it
+/// is safe to use it as the special symbol.
+const auto END_SYMBOL = '$';
+
 /// Returns list of unique fuzzy-search trigrams from unqualified symbol.
 ///
 /// First, given Identifier (unqualified symbol name) is segmented using
Index: clang-tools-extra/clangd/index/dex/Trigram.cpp
===
--- clang-tools-extra/clangd/index/dex/Trigram.cpp
+++ clang-tools-extra/clangd/index/dex/Trigram.cpp
@@ -10,11 +10,9 @@
 #include "Trigram.h"
 #include "../../FuzzyMatch.h"
 #include "Token.h"
-
 #include "llvm/ADT/ArrayRef.h"
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/StringExtras.h"
-
 #include 
 #include 
 #include 
@@ -59,15 +57,31 @@
 }
   }
 
+  // Iterate through valid sequences of three characters Fuzzy Matcher can
+  // process.
   DenseSet<Token> UniqueTrigrams;
   std::array<char, 4> Chars;
   for (size_t I = 0; I < LowercaseIdentifier.size(); ++I) {
 // Skip delimiters.
 if (Roles[I] != Head && Roles[I] != Tail)
   continue;
+
+   

[PATCH] D50517: [clangd] Generate incomplete trigrams for the Dex index

2018-08-09 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev planned changes to this revision.
kbobyrev added a comment.

This patch is in preview mode and can be useful for the discussion. It's not 
functional yet, but this will be changed in the future.

The upcoming changes would allow handling short queries introduced in 
https://reviews.llvm.org/D50337 in a more efficient manner.

@ioeric proposed to generate unigrams for the first letter of the identifier so 
that the index would only perform prefix match for one-letter completion 
requests, which I think would be a great performance improvement.


https://reviews.llvm.org/D50517



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D50500: [clangd] Allow consuming limited number of items

2018-08-09 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev planned changes to this revision.
kbobyrev added a comment.

Oops, I thought I pushed "Plan Changes" for this one.


https://reviews.llvm.org/D50500





[PATCH] D50500: [clangd] Allow consuming limited number of items

2018-08-09 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev updated this revision to Diff 159959.
kbobyrev marked 3 inline comments as done.
kbobyrev added a comment.

Fix the implementation and add test coverage.


https://reviews.llvm.org/D50500

Files:
  clang-tools-extra/clangd/index/dex/Iterator.cpp
  clang-tools-extra/clangd/index/dex/Iterator.h
  clang-tools-extra/unittests/clangd/DexIndexTests.cpp


Index: clang-tools-extra/unittests/clangd/DexIndexTests.cpp
===
--- clang-tools-extra/unittests/clangd/DexIndexTests.cpp
+++ clang-tools-extra/unittests/clangd/DexIndexTests.cpp
@@ -240,6 +240,29 @@
 "(& (& [1, 3, 5, 8, 9] [1, 5, 7, 9]) (| [0, 5] [0, 1, 5] []))");
 }
 
+TEST(DexIndexIterators, Limit) {
+  const PostingList L0 = {4, 7, 8, 20, 42, 100};
+  const PostingList L1 = {1, 3, 5, 8, 9};
+  const PostingList L2 = {1, 5, 7, 9};
+  const PostingList L3 = {0, 5};
+  const PostingList L4 = {0, 1, 5};
+  const PostingList L5;
+
+  auto DocIterator = create(L0);
+  EXPECT_THAT(consume(*DocIterator, 42), ElementsAre(4, 7, 8, 20, 42, 100));
+
+  DocIterator = create(L0);
+  EXPECT_THAT(consume(*DocIterator, 3), ElementsAre(4, 7, 8));
+
+  DocIterator = create(L0);
+  EXPECT_THAT(consume(*DocIterator, 0), ElementsAre());
+
+  auto Root = createAnd(createAnd(create(L1), create(L2)),
+createOr(create(L3), create(L4), create(L5)));
+
+  EXPECT_THAT(consume(*Root, 42), ElementsAre(1, 5));
+}
+
 testing::Matcher<std::vector<Token>>
 trigramsAre(std::initializer_list<std::string> Trigrams) {
   std::vector<Token> Tokens;
Index: clang-tools-extra/clangd/index/dex/Iterator.h
===
--- clang-tools-extra/clangd/index/dex/Iterator.h
+++ clang-tools-extra/clangd/index/dex/Iterator.h
@@ -101,9 +101,10 @@
   virtual llvm::raw_ostream &dump(llvm::raw_ostream &OS) const = 0;
 };
 
-/// Exhausts given iterator and returns all processed DocIDs. The result
-/// contains sorted DocumentIDs.
-std::vector<DocID> consume(Iterator &It);
+/// Advances the iterator until it is either exhausted or the number of
+/// requested items is reached. The result contains sorted DocumentIDs.
+std::vector<DocID> consume(Iterator &It,
+                           size_t Limit = std::numeric_limits<size_t>::max());
 
 /// Returns a document iterator over given PostingList.
 std::unique_ptr<Iterator> create(PostingListRef Documents);
Index: clang-tools-extra/clangd/index/dex/Iterator.cpp
===
--- clang-tools-extra/clangd/index/dex/Iterator.cpp
+++ clang-tools-extra/clangd/index/dex/Iterator.cpp
@@ -218,9 +218,10 @@
 
 } // end namespace
 
-std::vector<DocID> consume(Iterator &It) {
+std::vector<DocID> consume(Iterator &It, size_t Limit) {
   std::vector<DocID> Result;
-  for (; !It.reachedEnd(); It.advance())
+  for (size_t Retrieved = 0; !It.reachedEnd() && Retrieved < Limit;
+       It.advance(), ++Retrieved)
     Result.push_back(It.peek());
   return Result;
 }
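
For readers following the change outside of the diff, the limited `consume` can be sketched in isolation. `VectorIterator` below is a hypothetical stand-in for the real posting-list iterator and is not part of the patch; only the `reachedEnd()` / `advance()` / `peek()` calls and the shape of the loop are taken from the diff above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <utility>
#include <vector>

using DocID = uint32_t;

// Hypothetical minimal iterator over a sorted list of document IDs.
// It only mimics the three methods the loop in the patch relies on.
class VectorIterator {
public:
  explicit VectorIterator(std::vector<DocID> Docs) : Docs(std::move(Docs)) {}
  bool reachedEnd() const { return Index == Docs.size(); }
  void advance() { ++Index; }
  DocID peek() const { return Docs[Index]; }

private:
  std::vector<DocID> Docs;
  size_t Index = 0;
};

// Collects items until the iterator is exhausted or Limit items were taken.
// With the default Limit this degenerates to "consume everything".
std::vector<DocID> consume(VectorIterator &It,
                           size_t Limit = std::numeric_limits<size_t>::max()) {
  std::vector<DocID> Result;
  for (size_t Retrieved = 0; !It.reachedEnd() && Retrieved < Limit;
       It.advance(), ++Retrieved)
    Result.push_back(It.peek());
  return Result;
}
```

Because the limit is checked before `peek()`, a limit of 0 never touches the iterator, and a limit larger than the list simply returns every element, which is what the `Limit` test in this revision exercises.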



[PATCH] D50517: [clangd] Generate incomplete trigrams for the Dex index

2018-08-10 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev updated this revision to Diff 160071.
kbobyrev added a comment.

Complete the tests, finish the implementation.

One thought about the prefix match suggestion: we should either make it 
explicit in the index (e.g. introduce `prefixMatch` and dispatch `fuzzyMatch` 
to prefix matching when the query contains only one "true" symbol) or document 
it properly. As I wrote earlier, I fully support prefix matching for queries 
of length 1, but it might not align with some user expectations, and it is 
very implicit if we just generate tokens this way without mentioning it 
anywhere in the `DexIndex` implementation.

@ioeric, @ilya-biryukov any thoughts?


https://reviews.llvm.org/D50517

Files:
  clang-tools-extra/clangd/index/dex/Trigram.cpp
  clang-tools-extra/clangd/index/dex/Trigram.h
  clang-tools-extra/unittests/clangd/DexIndexTests.cpp

Index: clang-tools-extra/unittests/clangd/DexIndexTests.cpp
===
--- clang-tools-extra/unittests/clangd/DexIndexTests.cpp
+++ clang-tools-extra/unittests/clangd/DexIndexTests.cpp
@@ -250,45 +250,60 @@
 }
 
 TEST(DexIndexTrigrams, IdentifierTrigrams) {
-  EXPECT_THAT(generateIdentifierTrigrams("X86"), trigramsAre({"x86"}));
+  EXPECT_THAT(generateIdentifierTrigrams("X86"),
+  trigramsAre({"x86", "x$$", "x8$", "86$"}));
 
-  EXPECT_THAT(generateIdentifierTrigrams("nl"), trigramsAre({}));
+  EXPECT_THAT(generateIdentifierTrigrams("nl"), trigramsAre({"nl$", "n$$"}));
+
+  EXPECT_THAT(generateIdentifierTrigrams("n"), trigramsAre({"n$$"}));
 
   EXPECT_THAT(generateIdentifierTrigrams("clangd"),
-  trigramsAre({"cla", "lan", "ang", "ngd"}));
+  trigramsAre({"cla", "lan", "ang", "ngd", "an$", "c$$", "cl$",
+   "ng$", "gd$", "la$"}));
 
-  EXPECT_THAT(generateIdentifierTrigrams("abc_def"),
-  trigramsAre({"abc", "abd", "ade", "bcd", "bde", "cde", "def"}));
+  EXPECT_THAT(
+  generateIdentifierTrigrams("abc_def"),
+  trigramsAre({"abc", "abd", "ade", "bcd", "bde", "cde", "def", "a$$",
+   "ab$", "ad$", "bc$", "bd$", "cd$", "de$", "ef$"}));
 
   EXPECT_THAT(
   generateIdentifierTrigrams("a_b_c_d_e_"),
-  trigramsAre({"abc", "abd", "acd", "ace", "bcd", "bce", "bde", "cde"}));
+  trigramsAre({"abc", "abd", "acd", "ace", "bcd", "bce", "bde", "cde",
+   "a$$", "ab$", "ac$", "bc$", "bd$", "cd$", "ce$", "de$"}));
 
-  EXPECT_THAT(
-  generateIdentifierTrigrams("unique_ptr"),
-  trigramsAre({"uni", "unp", "upt", "niq", "nip", "npt", "iqu", "iqp",
-   "ipt", "que", "qup", "qpt", "uep", "ept", "ptr"}));
+  EXPECT_THAT(generateIdentifierTrigrams("unique_ptr"),
+  trigramsAre({"uni", "unp", "upt", "niq", "nip", "npt", "iqu",
+   "iqp", "ipt", "que", "qup", "qpt", "uep", "ept",
+   "ptr", "u$$", "un$", "up$", "ni$", "np$", "iq$",
+   "ip$", "qu$", "qp$", "ue$", "ep$", "pt$", "tr$"}));
 
   EXPECT_THAT(generateIdentifierTrigrams("TUDecl"),
-  trigramsAre({"tud", "tde", "ude", "dec", "ecl"}));
-
-  EXPECT_THAT(generateIdentifierTrigrams("IsOK"),
-  trigramsAre({"iso", "iok", "sok"}));
-
-  EXPECT_THAT(generateIdentifierTrigrams("abc_defGhij__klm"),
-  trigramsAre({
-  "abc", "abd", "abg", "ade", "adg", "adk", "agh", "agk", "bcd",
-  "bcg", "bde", "bdg", "bdk", "bgh", "bgk", "cde", "cdg", "cdk",
-  "cgh", "cgk", "def", "deg", "dek", "dgh", "dgk", "dkl", "efg",
-  "efk", "egh", "egk", "ekl", "fgh", "fgk", "fkl", "ghi", "ghk",
-  "gkl", "hij", "hik", "hkl", "ijk", "ikl", "jkl", "klm",
-  }));
+  trigramsAre({"tud", "tde", "ude", "dec", "ecl", "t$$", "tu$",
+   "td$", "ud$", "de$", "ec$", "cl$"}));
+
+  EXPECT_THAT(
+  generateIdentifierTrigrams("IsOK"),
+  trigramsAre({"iso", "iok", "sok", "i$$", "is$", "io$", "so$", "ok$"}));
+
+  EXPECT_THAT(
+  generateIdentifierTrigrams("abc_defGhij__klm"),
+  trigramsAre({"a$$", "abc", "abd", "abg", "ade", "adg", "adk", "agh",
+   "agk", "bcd", "bcg", "bde", "bdg", "bdk", "bgh", "bgk",
+   "cde", "cdg", "cdk", "cgh", "cgk", "def", "deg", "dek",
+   "dgh", "dgk", "dkl", "efg", "efk", "egh", "egk", "ekl",
+   "fgh", "fgk", "fkl", "ghi", "ghk", "gkl", "hij", "hik",
+   "hkl", "ijk", "ikl", "jkl", "klm", "ab$", "ad$", "ag$",
+   "bc$", "bd$", "bg$", "cd$", "cg$", "de$", "dg$", "dk$",
+   "ef$", "eg$", "ek$", "fg$", "fk$", "gh$", "gk$", "hi$",
+   "hk$", "ij$", "ik$", "jk$", "kl$", "lm$"}));
 }
 
 TEST(DexIndexTrigrams, QueryTrigrams) {
-  EXPECT_THAT(generateQueryTrigrams("X86"), trigramsAre({"x86"}));
+  EXPECT_THAT(generateQueryTrigrams("c"), trigr

[PATCH] D50517: [clangd] Generate incomplete trigrams for the Dex index

2018-08-10 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev planned changes to this revision.
kbobyrev added a comment.

As discussed offline with @ilya-biryukov, the better approach would be to 
prefix match first symbols of each distinct identifier piece instead of prefix 
matching (just looking at the first letter of the identifier) the whole 
identifier.

Example:

- Query: `"u"`
- Symbols: `"unique_ptr"`, `"user"`, `"super_user"`

Current implementation would match `"unique_ptr"` and `"user"` only.
Proposed implementation would match all three symbols, because the second piece 
of `"super_user"` starts with `u`.

This might be useful for codebases where e.g. each identifier starts with some 
project prefix (`ProjectInstruction`, `ProjectGraph`, etc). For C++, namespaces 
are a better choice than this naming scheme, but I am aware of C++ projects 
which actually opt for such a naming convention. In pure C, however, this is a 
relatively common practice, e.g. a typical piece of code for GNOME might be

    GtkOrientation        orientation;
    GtkWrapAllocationMode mode;

    GtkWrapBoxSpreading   horizontal_spreading;
    GtkWrapBoxSpreading   vertical_spreading;

    guint16               vertical_spacing;
    guint16               horizontal_spacing;

    guint16               minimum_line_children;
    guint16               natural_line_children;

    GList                *children;
  };

Also, this is better for macros, which cannot be put into namespaces anyway; 
`BENCHMARK_UNREACHABLE` is one example.

I'll update the patch with the proposed solution.
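
A rough sketch of the proposed behaviour, assuming a simplified segmentation heuristic rather than the real FuzzyMatch `calculateRoles`. The names `isSegmentHead` and `headUnigrams` are made up for this illustration and do not come from the patch:

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical helper: true if Identifier[I] starts a new segment.
// This is a simplified heuristic, not the real calculateRoles().
bool isSegmentHead(const std::string &Identifier, size_t I) {
  const unsigned char C = Identifier[I];
  if (!std::isalnum(C))
    return false; // separators such as '_' never start a segment
  if (I == 0)
    return true;
  const unsigned char Prev = Identifier[I - 1];
  if (!std::isalnum(Prev))
    return true; // first valid character after a separator
  if (!std::isupper(C))
    return false;
  if (!std::isupper(Prev))
    return true; // camelCase boundary, e.g. 'G' in "defGhij"
  // End of an uppercase run, e.g. 'D' in "TUDecl".
  return I + 1 < Identifier.size() &&
         std::islower(static_cast<unsigned char>(Identifier[I + 1]));
}

// Emits one "x$$" unigram per segment head instead of only one for the
// very first character of the identifier.
std::set<std::string> headUnigrams(const std::string &Identifier) {
  std::set<std::string> Unigrams;
  for (size_t I = 0; I < Identifier.size(); ++I) {
    if (!isSegmentHead(Identifier, I))
      continue;
    std::string Unigram(1, static_cast<char>(std::tolower(
                               static_cast<unsigned char>(Identifier[I]))));
    Unigram += "$$";
    Unigrams.insert(Unigram);
  }
  return Unigrams;
}
```

With this scheme a query of `"u"` becomes the token `u$$`, which is present for `unique_ptr`, `user` and `super_user` alike, whereas generating the unigram only for the first character would leave `super_user` out.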


https://reviews.llvm.org/D50517





[PATCH] D50517: [clangd] Generate incomplete trigrams for the Dex index

2018-08-10 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev updated this revision to Diff 160074.
kbobyrev added a comment.

@ilya-biryukov I have changed the approach to the one we discussed before.


https://reviews.llvm.org/D50517

Files:
  clang-tools-extra/clangd/index/dex/Trigram.cpp
  clang-tools-extra/clangd/index/dex/Trigram.h
  clang-tools-extra/unittests/clangd/DexIndexTests.cpp

Index: clang-tools-extra/unittests/clangd/DexIndexTests.cpp
===
--- clang-tools-extra/unittests/clangd/DexIndexTests.cpp
+++ clang-tools-extra/unittests/clangd/DexIndexTests.cpp
@@ -250,45 +250,61 @@
 }
 
 TEST(DexIndexTrigrams, IdentifierTrigrams) {
-  EXPECT_THAT(generateIdentifierTrigrams("X86"), trigramsAre({"x86"}));
+  EXPECT_THAT(generateIdentifierTrigrams("X86"),
+  trigramsAre({"x86", "x$$", "x8$", "86$"}));
 
-  EXPECT_THAT(generateIdentifierTrigrams("nl"), trigramsAre({}));
+  EXPECT_THAT(generateIdentifierTrigrams("nl"), trigramsAre({"nl$", "n$$"}));
 
-  EXPECT_THAT(generateIdentifierTrigrams("clangd"),
-  trigramsAre({"cla", "lan", "ang", "ngd"}));
+  EXPECT_THAT(generateIdentifierTrigrams("n"), trigramsAre({"n$$"}));
 
-  EXPECT_THAT(generateIdentifierTrigrams("abc_def"),
-  trigramsAre({"abc", "abd", "ade", "bcd", "bde", "cde", "def"}));
+  EXPECT_THAT(generateIdentifierTrigrams("clangd"),
+  trigramsAre({"cla", "lan", "ang", "ngd", "an$", "c$$", "cl$",
+   "ng$", "gd$", "la$"}));
 
   EXPECT_THAT(
-  generateIdentifierTrigrams("a_b_c_d_e_"),
-  trigramsAre({"abc", "abd", "acd", "ace", "bcd", "bce", "bde", "cde"}));
+  generateIdentifierTrigrams("abc_def"),
+  trigramsAre({"a$$", "d$$", "abc", "abd", "ade", "bcd", "bde", "cde",
+   "def", "ab$", "ad$", "bc$", "bd$", "cd$", "de$", "ef$"}));
+
+  EXPECT_THAT(generateIdentifierTrigrams("a_b_c_d_e_"),
+  trigramsAre({"a$$", "b$$", "c$$", "d$$", "e$$", "abc", "abd",
+   "acd", "ace", "bcd", "bce", "bde", "cde", "ab$",
+   "ac$", "bc$", "bd$", "cd$", "ce$", "de$"}));
 
   EXPECT_THAT(
   generateIdentifierTrigrams("unique_ptr"),
-  trigramsAre({"uni", "unp", "upt", "niq", "nip", "npt", "iqu", "iqp",
-   "ipt", "que", "qup", "qpt", "uep", "ept", "ptr"}));
+  trigramsAre({"u$$", "p$$", "uni", "unp", "upt", "niq", "nip", "npt",
+   "iqu", "iqp", "ipt", "que", "qup", "qpt", "uep", "ept",
+   "ptr", "un$", "up$", "ni$", "np$", "iq$", "ip$", "qu$",
+   "qp$", "ue$", "ep$", "pt$", "tr$"}));
 
   EXPECT_THAT(generateIdentifierTrigrams("TUDecl"),
-  trigramsAre({"tud", "tde", "ude", "dec", "ecl"}));
+  trigramsAre({"t$$", "d$$", "tud", "tde", "ude", "dec", "ecl",
+   "tu$", "td$", "ud$", "de$", "ec$", "cl$"}));
 
   EXPECT_THAT(generateIdentifierTrigrams("IsOK"),
-  trigramsAre({"iso", "iok", "sok"}));
-
-  EXPECT_THAT(generateIdentifierTrigrams("abc_defGhij__klm"),
-  trigramsAre({
-  "abc", "abd", "abg", "ade", "adg", "adk", "agh", "agk", "bcd",
-  "bcg", "bde", "bdg", "bdk", "bgh", "bgk", "cde", "cdg", "cdk",
-  "cgh", "cgk", "def", "deg", "dek", "dgh", "dgk", "dkl", "efg",
-  "efk", "egh", "egk", "ekl", "fgh", "fgk", "fkl", "ghi", "ghk",
-  "gkl", "hij", "hik", "hkl", "ijk", "ikl", "jkl", "klm",
-  }));
+  trigramsAre({"i$$", "o$$", "iso", "iok", "sok", "is$", "io$",
+   "so$", "ok$"}));
+
+  EXPECT_THAT(
+  generateIdentifierTrigrams("abc_defGhij__klm"),
+  trigramsAre(
+  {"a$$", "d$$", "g$$", "k$$", "abc", "abd", "abg", "ade", "adg", "adk",
+   "agh", "agk", "bcd", "bcg", "bde", "bdg", "bdk", "bgh", "bgk", "cde",
+   "cdg", "cdk", "cgh", "cgk", "def", "deg", "dek", "dgh", "dgk", "dkl",
+   "efg", "efk", "egh", "egk", "ekl", "fgh", "fgk", "fkl", "ghi", "ghk",
+   "gkl", "hij", "hik", "hkl", "ijk", "ikl", "jkl", "klm", "ab$", "ad$",
+   "ag$", "bc$", "bd$", "bg$", "cd$", "cg$", "de$", "dg$", "dk$", "ef$",
+   "eg$", "ek$", "fg$", "fk$", "gh$", "gk$", "hi$", "hk$", "ij$", "ik$",
+   "jk$", "kl$", "lm$"}));
 }
 
 TEST(DexIndexTrigrams, QueryTrigrams) {
-  EXPECT_THAT(generateQueryTrigrams("X86"), trigramsAre({"x86"}));
+  EXPECT_THAT(generateQueryTrigrams("c"), trigramsAre({"c$$"}));
+  EXPECT_THAT(generateQueryTrigrams("cl"), trigramsAre({"cl$"}));
+  EXPECT_THAT(generateQueryTrigrams("cla"), trigramsAre({"cla"}));
 
-  EXPECT_THAT(generateQueryTrigrams("nl"), trigramsAre({}));
+  EXPECT_THAT(generateQueryTrigrams("X86"), trigramsAre({"x86"}));
 
   EXPECT_THAT(generateQueryTrigrams("clangd"),
   trigramsAre({"cla", "lan", "ang", "ngd"}));
Index: clang-tools-extra/clangd/index/dex/Trigram.h

[PATCH] D50517: [clangd] Generate incomplete trigrams for the Dex index

2018-08-10 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev updated this revision to Diff 160081.
kbobyrev marked 5 inline comments as done.
kbobyrev added a comment.

Address a round of comments.

I have added a few comments to get additional feedback before further changes 
are made.


https://reviews.llvm.org/D50517

Files:
  clang-tools-extra/clangd/index/dex/Iterator.h
  clang-tools-extra/clangd/index/dex/Trigram.cpp
  clang-tools-extra/clangd/index/dex/Trigram.h
  clang-tools-extra/unittests/clangd/DexIndexTests.cpp

Index: clang-tools-extra/unittests/clangd/DexIndexTests.cpp
===
--- clang-tools-extra/unittests/clangd/DexIndexTests.cpp
+++ clang-tools-extra/unittests/clangd/DexIndexTests.cpp
@@ -250,45 +250,55 @@
 }
 
 TEST(DexIndexTrigrams, IdentifierTrigrams) {
-  EXPECT_THAT(generateIdentifierTrigrams("X86"), trigramsAre({"x86"}));
+  EXPECT_THAT(generateIdentifierTrigrams("X86"),
+  trigramsAre({"x86", "x$$", "x8$"}));
 
-  EXPECT_THAT(generateIdentifierTrigrams("nl"), trigramsAre({}));
+  EXPECT_THAT(generateIdentifierTrigrams("nl"), trigramsAre({"nl$", "n$$"}));
+
+  EXPECT_THAT(generateIdentifierTrigrams("n"), trigramsAre({"n$$"}));
 
   EXPECT_THAT(generateIdentifierTrigrams("clangd"),
-  trigramsAre({"cla", "lan", "ang", "ngd"}));
+  trigramsAre({"c$$", "cl$", "cla", "lan", "ang", "ngd"}));
 
   EXPECT_THAT(generateIdentifierTrigrams("abc_def"),
-  trigramsAre({"abc", "abd", "ade", "bcd", "bde", "cde", "def"}));
+  trigramsAre({"a$$", "d$$", "abc", "abd", "ade", "bcd", "bde",
+   "cde", "def", "ab$", "ad$", "de$"}));
 
-  EXPECT_THAT(
-  generateIdentifierTrigrams("a_b_c_d_e_"),
-  trigramsAre({"abc", "abd", "acd", "ace", "bcd", "bce", "bde", "cde"}));
+  EXPECT_THAT(generateIdentifierTrigrams("a_b_c_d_e_"),
+  trigramsAre({"a$$", "b$$", "c$$", "d$$", "e$$", "abc", "abd",
+   "acd", "ace", "bcd", "bce", "bde", "cde", "ab$",
+   "ac$", "bc$", "bd$", "cd$", "ce$", "de$"}));
 
-  EXPECT_THAT(
-  generateIdentifierTrigrams("unique_ptr"),
-  trigramsAre({"uni", "unp", "upt", "niq", "nip", "npt", "iqu", "iqp",
-   "ipt", "que", "qup", "qpt", "uep", "ept", "ptr"}));
+  EXPECT_THAT(generateIdentifierTrigrams("unique_ptr"),
+  trigramsAre({"u$$", "p$$", "uni", "unp", "upt", "niq", "nip",
+   "npt", "iqu", "iqp", "ipt", "que", "qup", "qpt",
+   "uep", "ept", "ptr", "un$", "up$", "pt$"}));
 
   EXPECT_THAT(generateIdentifierTrigrams("TUDecl"),
-  trigramsAre({"tud", "tde", "ude", "dec", "ecl"}));
-
-  EXPECT_THAT(generateIdentifierTrigrams("IsOK"),
-  trigramsAre({"iso", "iok", "sok"}));
-
-  EXPECT_THAT(generateIdentifierTrigrams("abc_defGhij__klm"),
-  trigramsAre({
-  "abc", "abd", "abg", "ade", "adg", "adk", "agh", "agk", "bcd",
-  "bcg", "bde", "bdg", "bdk", "bgh", "bgk", "cde", "cdg", "cdk",
-  "cgh", "cgk", "def", "deg", "dek", "dgh", "dgk", "dkl", "efg",
-  "efk", "egh", "egk", "ekl", "fgh", "fgk", "fkl", "ghi", "ghk",
-  "gkl", "hij", "hik", "hkl", "ijk", "ikl", "jkl", "klm",
-  }));
+  trigramsAre({"t$$", "d$$", "tud", "tde", "ude", "dec", "ecl",
+   "tu$", "td$", "de$"}));
+
+  EXPECT_THAT(
+  generateIdentifierTrigrams("IsOK"),
+  trigramsAre({"i$$", "o$$", "iso", "iok", "sok", "is$", "io$", "ok$"}));
+
+  EXPECT_THAT(
+  generateIdentifierTrigrams("abc_defGhij__klm"),
+  trigramsAre(
+  {"a$$", "d$$", "g$$", "k$$", "abc", "abd", "abg", "ade", "adg", "adk",
+   "agh", "agk", "bcd", "bcg", "bde", "bdg", "bdk", "bgh", "bgk", "cde",
+   "cdg", "cdk", "cgh", "cgk", "def", "deg", "dek", "dgh", "dgk", "dkl",
+   "efg", "efk", "egh", "egk", "ekl", "fgh", "fgk", "fkl", "ghi", "ghk",
+   "gkl", "hij", "hik", "hkl", "ijk", "ikl", "jkl", "klm", "ab$", "ad$",
+   "ag$", "de$", "dg$", "dk$", "gh$", "gk$", "kl$"}));
 }
 
 TEST(DexIndexTrigrams, QueryTrigrams) {
-  EXPECT_THAT(generateQueryTrigrams("X86"), trigramsAre({"x86"}));
+  EXPECT_THAT(generateQueryTrigrams("c"), trigramsAre({"c$$"}));
+  EXPECT_THAT(generateQueryTrigrams("cl"), trigramsAre({"cl$"}));
+  EXPECT_THAT(generateQueryTrigrams("cla"), trigramsAre({"cla"}));
 
-  EXPECT_THAT(generateQueryTrigrams("nl"), trigramsAre({}));
+  EXPECT_THAT(generateQueryTrigrams("X86"), trigramsAre({"x86"}));
 
   EXPECT_THAT(generateQueryTrigrams("clangd"),
   trigramsAre({"cla", "lan", "ang", "ngd"}));
Index: clang-tools-extra/clangd/index/dex/Trigram.h
===
--- clang-tools-extra/clangd/index/dex/Trigram.h
+++ clang-tools-extra/clangd/index/dex/Trigram.h
@@ -44,6 +44,14 @@
 /// to the next character, move to the start

[PATCH] D50517: [clangd] Generate incomplete trigrams for the Dex index

2018-08-10 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev added inline comments.



Comment at: clang-tools-extra/clangd/index/dex/Trigram.cpp:74
+// symbol of the identifier.
+if (!FoundFirstSymbol) {
+  FoundFirstSymbol = true;

ioeric wrote:
> Could this be pulled out of the loop? I think what we want is just 
> `LowercaseIdentifier[0]` right?
> 
> I'd probably also pulled that into a function, as the function body is 
> getting larger.
Same as elsewhere: if we have `__builtin_whatever`, then it's not actually the 
first symbol of the lowercase identifier.



Comment at: clang-tools-extra/clangd/index/dex/Trigram.cpp:87
+
+  Chars = {{LowercaseIdentifier[I], LowercaseIdentifier[J], END_MARKER, 0}};
+  const auto Bigram = Token(Token::Kind::Trigram, Chars.data());

ioeric wrote:
> I think we could be more restrictive on bigram generation. I think a bigram 
> prefix of identifier and a bigram prefix of the HEAD substring should work 
> pretty well in practice. For example, for `StringStartsWith`, you would have 
> `st$` and `ss$` (prefix of "SSW").
> 
> WDYT?
Good idea!
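
To make the suggestion concrete, here is a minimal sketch of the restricted bigram generation as described in the quoted comment: one bigram from the first two valid characters of the identifier and one from the first two segment heads. The helper names `startsSegment` and `restrictedBigrams` and the segmentation heuristic are assumptions for illustration; the real code would rely on the roles computed by FuzzyMatch.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical, simplified head detection (not the real calculateRoles()).
bool startsSegment(const std::string &Identifier, size_t I) {
  const unsigned char C = Identifier[I];
  if (!std::isalnum(C))
    return false;
  if (I == 0 || !std::isalnum(static_cast<unsigned char>(Identifier[I - 1])))
    return true; // first valid character of the identifier or of a piece
  if (!std::isupper(C))
    return false;
  if (!std::isupper(static_cast<unsigned char>(Identifier[I - 1])))
    return true; // camelCase boundary
  return I + 1 < Identifier.size() &&
         std::islower(static_cast<unsigned char>(Identifier[I + 1]));
}

// Returns at most two bigram tokens, each padded with '$'.
std::set<std::string> restrictedBigrams(const std::string &Identifier) {
  std::string Valid; // lowercased letters and digits, in order
  std::string Heads; // lowercased segment heads, in order
  for (size_t I = 0; I < Identifier.size(); ++I) {
    const unsigned char C = Identifier[I];
    if (!std::isalnum(C))
      continue;
    const char Lower = static_cast<char>(std::tolower(C));
    Valid.push_back(Lower);
    if (startsSegment(Identifier, I))
      Heads.push_back(Lower);
  }
  std::set<std::string> Bigrams;
  if (Valid.size() >= 2)
    Bigrams.insert(Valid.substr(0, 2) + "$"); // prefix of the identifier
  if (Heads.size() >= 2)
    Bigrams.insert(Heads.substr(0, 2) + "$"); // prefix of the head string
  return Bigrams;
}
```

For `StringStartsWith` the head string is "SSW", so the two tokens are `st$` and `ss$`, matching the example in the comment. Compared with emitting a bigram for every pair of neighbouring characters, this keeps the number of extra tokens per identifier constant.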



Comment at: clang-tools-extra/clangd/index/dex/Trigram.cpp:115
+// FIXME(kbobyrev): Correctly handle empty trigrams "$$$".
 std::vector<Token> generateQueryTrigrams(llvm::StringRef Query) {
   // Apply fuzzy matching text segmentation.

ioeric wrote:
> It seems to me that what we need for short queries is simply:
> ```
> if (Query.empty()) {
>// return empty token
> }
> if (Query.size() == 1) return {Query + "$$"};
> if (Query.size() == 2) return {Query + "$"};
> 
> // Longer queries...
> ```
> ?
That would mean that we expect the query to be "valid", i.e. to consist only 
of letters and digits. My concern is about what happens if we have `"u_"` or 
something similar (`"_u"`, `"_u_"`, `"$u$"`, etc.): in that case we would 
still have to identify the first valid symbol for the trigram and process the 
string (trim it, etc.), which looks very similar to what FuzzyMatching 
`calculateRoles` does.

The current approach is rather straightforward and generic, but I can try to 
change it if you want. My biggest concern is fighting corner cases and making 
sure that the query is "right" on the user (index) side, which might turn out 
to require more code, while also keeping the "state" valid throughout the 
pipeline.
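
As an illustration of the corner cases mentioned here, the sketch below filters out everything that is not a letter or digit before padding with `$`. `shortQueryToken` is a hypothetical helper, not part of the patch, and returning an empty string for "no short token" is a simplification chosen for this example; the empty query (the `$$$` case from the FIXME) is deliberately left unhandled.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper: builds the padded token for a short query.
// Returns "" when the query has no valid characters or has three or more,
// in which case regular trigram generation would apply instead.
std::string shortQueryToken(const std::string &Query) {
  std::string Valid;
  for (const unsigned char C : Query)
    if (std::isalnum(C))
      Valid.push_back(static_cast<char>(std::tolower(C)));
  if (Valid.empty() || Valid.size() > 2)
    return "";
  Valid.append(3 - Valid.size(), '$'); // pad to trigram length
  return Valid;
}
```

Under this sketch `"u_"`, `"_u"`, `"_u_"` and `"$u$"` all map to `u$$`, so a plain `Query.size()` check would not be enough; some filtering step is needed whichever approach is chosen.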



Comment at: clang-tools-extra/clangd/index/dex/Trigram.h:74
+/// For short queries (Query contains less than 3 letters and digits) this
+/// returns a single trigram with all valid symbols.
 std::vector<Token> generateQueryTrigrams(llvm::StringRef Query);

ioeric wrote:
> I'm not quite sure what this means. Could you elaborate?
Added an example and reflected in the other comment.


https://reviews.llvm.org/D50517





[PATCH] D50517: [clangd] Generate incomplete trigrams for the Dex index

2018-08-10 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev added a comment.

In https://reviews.llvm.org/D50517#1194990, @ioeric wrote:

> In https://reviews.llvm.org/D50517#1194976, @kbobyrev wrote:
>
> > As discussed offline with @ilya-biryukov, the better approach would be to 
> > prefix match first symbols of each distinct identifier piece instead of 
> > prefix matching (just looking at the first letter of the identifier) the 
> > whole identifier.
> >
> > Example:
> >
> > - Query: `"u"`
> > - Symbols: `"unique_ptr"`, `"user"`, `"super_user"`
> >
> >   Current implementation would match `"unique_ptr"` and `"user"` only. 
> > Proposed implementation would match all three symbols, because the second 
> > piece of `"super_user"` starts with `u`.
>
>
> And in the case where users want to match `super_user`, I think it's 
> reasonable to have users type two more characters and match it with `use`.


That would probably yield lower code completion quality for identifiers like 
`GtkWhatever` which might be very common in pure C projects and elsewhere. 
Also, Ilya mentioned that fuzzy matching filter would significantly increase 
the score of symbols which can be prefix matched and hence they would end up at 
the top if the quality is actually good. Another thing we can do is to boost 
prefix matched symbols if your concern is about them being removed after the 
initial filtering.

I'm personally leaning towards having unigrams for all segment starting 
symbols, but if you believe that it's certainly bad I can change that; it will 
be rather trivial to switch back later if we decide to. 
What do you think?


https://reviews.llvm.org/D50517





[PATCH] D50500: [clangd] Allow consuming limited number of items

2018-08-10 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev updated this revision to Diff 160088.
kbobyrev marked 2 inline comments as done.

https://reviews.llvm.org/D50500

Files:
  clang-tools-extra/clangd/index/dex/Iterator.cpp
  clang-tools-extra/clangd/index/dex/Iterator.h
  clang-tools-extra/unittests/clangd/DexIndexTests.cpp


Index: clang-tools-extra/unittests/clangd/DexIndexTests.cpp
===
--- clang-tools-extra/unittests/clangd/DexIndexTests.cpp
+++ clang-tools-extra/unittests/clangd/DexIndexTests.cpp
@@ -240,6 +240,27 @@
 "(& (& [1, 3, 5, 8, 9] [1, 5, 7, 9]) (| [0, 5] [0, 1, 5] []))");
 }
 
+TEST(DexIndexIterators, Limit) {
+  const PostingList L0 = {4, 7, 8, 20, 42, 100};
+  const PostingList L1 = {1, 3, 5, 8, 9};
+  const PostingList L2 = {1, 5, 7, 9};
+  const PostingList L3 = {0, 5};
+  const PostingList L4 = {0, 1, 5};
+  const PostingList L5;
+
+  auto DocIterator = create(L0);
+  EXPECT_THAT(consume(*DocIterator, 42), ElementsAre(4, 7, 8, 20, 42, 100));
+
+  DocIterator = create(L0);
+  EXPECT_THAT(consume(*DocIterator), ElementsAre(4, 7, 8, 20, 42, 100));
+
+  DocIterator = create(L0);
+  EXPECT_THAT(consume(*DocIterator, 3), ElementsAre(4, 7, 8));
+
+  DocIterator = create(L0);
+  EXPECT_THAT(consume(*DocIterator, 0), ElementsAre());
+}
+
 testing::Matcher<std::vector<Token>>
 trigramsAre(std::initializer_list<std::string> Trigrams) {
   std::vector<Token> Tokens;
Index: clang-tools-extra/clangd/index/dex/Iterator.h
===
--- clang-tools-extra/clangd/index/dex/Iterator.h
+++ clang-tools-extra/clangd/index/dex/Iterator.h
@@ -101,9 +101,10 @@
   virtual llvm::raw_ostream &dump(llvm::raw_ostream &OS) const = 0;
 };
 
-/// Exhausts given iterator and returns all processed DocIDs. The result
-/// contains sorted DocumentIDs.
-std::vector<DocID> consume(Iterator &It);
+/// Advances the iterator until it is either exhausted or the number of
+/// requested items is reached. The result contains sorted DocumentIDs.
+std::vector<DocID> consume(Iterator &It,
+                           size_t Limit = std::numeric_limits<size_t>::max());
 
 /// Returns a document iterator over given PostingList.
 std::unique_ptr<Iterator> create(PostingListRef Documents);
Index: clang-tools-extra/clangd/index/dex/Iterator.cpp
===
--- clang-tools-extra/clangd/index/dex/Iterator.cpp
+++ clang-tools-extra/clangd/index/dex/Iterator.cpp
@@ -218,9 +218,10 @@
 
 } // end namespace
 
-std::vector<DocID> consume(Iterator &It) {
+std::vector<DocID> consume(Iterator &It, size_t Limit) {
   std::vector<DocID> Result;
-  for (; !It.reachedEnd(); It.advance())
+  for (size_t Retrieved = 0; !It.reachedEnd() && Retrieved < Limit;
+       It.advance(), ++Retrieved)
     Result.push_back(It.peek());
   return Result;
 }



[PATCH] D50500: [clangd] Allow consuming limited number of items

2018-08-10 Thread Kirill Bobyrev via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL339426: [clangd] Allow consuming limited number of items 
(authored by omtcyfz, committed by ).
Herald added a subscriber: llvm-commits.

Changed prior to commit:
  https://reviews.llvm.org/D50500?vs=160088&id=160089#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D50500

Files:
  clang-tools-extra/trunk/clangd/index/dex/Iterator.cpp
  clang-tools-extra/trunk/clangd/index/dex/Iterator.h
  clang-tools-extra/trunk/unittests/clangd/DexIndexTests.cpp


Index: clang-tools-extra/trunk/unittests/clangd/DexIndexTests.cpp
===
--- clang-tools-extra/trunk/unittests/clangd/DexIndexTests.cpp
+++ clang-tools-extra/trunk/unittests/clangd/DexIndexTests.cpp
@@ -240,6 +240,27 @@
 "(& (& [1, 3, 5, 8, 9] [1, 5, 7, 9]) (| [0, 5] [0, 1, 5] []))");
 }
 
+TEST(DexIndexIterators, Limit) {
+  const PostingList L0 = {4, 7, 8, 20, 42, 100};
+  const PostingList L1 = {1, 3, 5, 8, 9};
+  const PostingList L2 = {1, 5, 7, 9};
+  const PostingList L3 = {0, 5};
+  const PostingList L4 = {0, 1, 5};
+  const PostingList L5;
+
+  auto DocIterator = create(L0);
+  EXPECT_THAT(consume(*DocIterator, 42), ElementsAre(4, 7, 8, 20, 42, 100));
+
+  DocIterator = create(L0);
+  EXPECT_THAT(consume(*DocIterator), ElementsAre(4, 7, 8, 20, 42, 100));
+
+  DocIterator = create(L0);
+  EXPECT_THAT(consume(*DocIterator, 3), ElementsAre(4, 7, 8));
+
+  DocIterator = create(L0);
+  EXPECT_THAT(consume(*DocIterator, 0), ElementsAre());
+}
+
 testing::Matcher<std::vector<Token>>
 trigramsAre(std::initializer_list<std::string> Trigrams) {
   std::vector<Token> Tokens;
Index: clang-tools-extra/trunk/clangd/index/dex/Iterator.cpp
===
--- clang-tools-extra/trunk/clangd/index/dex/Iterator.cpp
+++ clang-tools-extra/trunk/clangd/index/dex/Iterator.cpp
@@ -218,9 +218,10 @@
 
 } // end namespace
 
-std::vector<DocID> consume(Iterator &It) {
+std::vector<DocID> consume(Iterator &It, size_t Limit) {
   std::vector<DocID> Result;
-  for (; !It.reachedEnd(); It.advance())
+  for (size_t Retrieved = 0; !It.reachedEnd() && Retrieved < Limit;
+       It.advance(), ++Retrieved)
     Result.push_back(It.peek());
   return Result;
 }
Index: clang-tools-extra/trunk/clangd/index/dex/Iterator.h
===
--- clang-tools-extra/trunk/clangd/index/dex/Iterator.h
+++ clang-tools-extra/trunk/clangd/index/dex/Iterator.h
@@ -101,9 +101,10 @@
   virtual llvm::raw_ostream &dump(llvm::raw_ostream &OS) const = 0;
 };
 
-/// Exhausts given iterator and returns all processed DocIDs. The result
-/// contains sorted DocumentIDs.
-std::vector<DocID> consume(Iterator &It);
+/// Advances the iterator until it is either exhausted or the number of
+/// requested items is reached. The result contains sorted DocumentIDs.
+std::vector<DocID> consume(Iterator &It,
+                           size_t Limit = std::numeric_limits<size_t>::max());
 
 /// Returns a document iterator over given PostingList.
 std::unique_ptr<Iterator> create(PostingListRef Documents);



[PATCH] D50517: [clangd] Generate incomplete trigrams for the Dex index

2018-08-10 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev updated this revision to Diff 160093.
kbobyrev marked 8 inline comments as done.
kbobyrev added a comment.

Address issues we discussed with Eric.


https://reviews.llvm.org/D50517

Files:
  clang-tools-extra/clangd/index/dex/Iterator.h
  clang-tools-extra/clangd/index/dex/Trigram.cpp
  clang-tools-extra/clangd/index/dex/Trigram.h
  clang-tools-extra/unittests/clangd/DexIndexTests.cpp

Index: clang-tools-extra/unittests/clangd/DexIndexTests.cpp
===
--- clang-tools-extra/unittests/clangd/DexIndexTests.cpp
+++ clang-tools-extra/unittests/clangd/DexIndexTests.cpp
@@ -250,45 +250,59 @@
 }
 
 TEST(DexIndexTrigrams, IdentifierTrigrams) {
-  EXPECT_THAT(generateIdentifierTrigrams("X86"), trigramsAre({"x86"}));
+  EXPECT_THAT(generateIdentifierTrigrams("X86"),
+  trigramsAre({"x86", "x$$", "x8$", "$$$"}));
 
-  EXPECT_THAT(generateIdentifierTrigrams("nl"), trigramsAre({}));
+  EXPECT_THAT(generateIdentifierTrigrams("nl"),
+  trigramsAre({"nl$", "n$$", "$$$"}));
+
+  EXPECT_THAT(generateIdentifierTrigrams("n"), trigramsAre({"n$$", "$$$"}));
 
   EXPECT_THAT(generateIdentifierTrigrams("clangd"),
-  trigramsAre({"cla", "lan", "ang", "ngd"}));
+  trigramsAre({"c$$", "cl$", "cla", "lan", "ang", "ngd", "$$$"}));
 
   EXPECT_THAT(generateIdentifierTrigrams("abc_def"),
-  trigramsAre({"abc", "abd", "ade", "bcd", "bde", "cde", "def"}));
+  trigramsAre({"a$$", "abc", "abd", "ade", "bcd", "bde", "cde",
+   "def", "ab$", "ad$", "de$", "$$$"}));
 
-  EXPECT_THAT(
-  generateIdentifierTrigrams("a_b_c_d_e_"),
-  trigramsAre({"abc", "abd", "acd", "ace", "bcd", "bce", "bde", "cde"}));
+  EXPECT_THAT(generateIdentifierTrigrams("a_b_c_d_e_"),
+  trigramsAre({"a$$", "a_$", "abc", "abd", "acd", "ace", "bcd",
+   "bce", "bde", "cde", "ab$", "ac$", "bc$", "bd$",
+   "cd$", "ce$", "de$", "$$$"}));
 
-  EXPECT_THAT(
-  generateIdentifierTrigrams("unique_ptr"),
-  trigramsAre({"uni", "unp", "upt", "niq", "nip", "npt", "iqu", "iqp",
-   "ipt", "que", "qup", "qpt", "uep", "ept", "ptr"}));
+  EXPECT_THAT(generateIdentifierTrigrams("unique_ptr"),
+  trigramsAre({"u$$", "uni", "unp", "upt", "niq", "nip", "npt",
+   "iqu", "iqp", "ipt", "que", "qup", "qpt", "uep",
+   "ept", "ptr", "un$", "up$", "pt$", "$$$"}));
 
   EXPECT_THAT(generateIdentifierTrigrams("TUDecl"),
-  trigramsAre({"tud", "tde", "ude", "dec", "ecl"}));
-
-  EXPECT_THAT(generateIdentifierTrigrams("IsOK"),
-  trigramsAre({"iso", "iok", "sok"}));
-
-  EXPECT_THAT(generateIdentifierTrigrams("abc_defGhij__klm"),
-  trigramsAre({
-  "abc", "abd", "abg", "ade", "adg", "adk", "agh", "agk", "bcd",
-  "bcg", "bde", "bdg", "bdk", "bgh", "bgk", "cde", "cdg", "cdk",
-  "cgh", "cgk", "def", "deg", "dek", "dgh", "dgk", "dkl", "efg",
-  "efk", "egh", "egk", "ekl", "fgh", "fgk", "fkl", "ghi", "ghk",
-  "gkl", "hij", "hik", "hkl", "ijk", "ikl", "jkl", "klm",
-  }));
+  trigramsAre({"t$$", "tud", "tde", "ude", "dec", "ecl", "tu$",
+   "td$", "de$", "$$$"}));
+
+  EXPECT_THAT(
+  generateIdentifierTrigrams("IsOK"),
+  trigramsAre({"i$$", "iso", "iok", "sok", "is$", "io$", "ok$", "$$$"}));
+
+  EXPECT_THAT(
+  generateIdentifierTrigrams("abc_defGhij__klm"),
+  trigramsAre({"a$$", "abc", "abd", "abg", "ade", "adg", "adk", "agh",
+   "agk", "bcd", "bcg", "bde", "bdg", "bdk", "bgh", "bgk",
+   "cde", "cdg", "cdk", "cgh", "cgk", "def", "deg", "dek",
+   "dgh", "dgk", "dkl", "efg", "efk", "egh", "egk", "ekl",
+   "fgh", "fgk", "fkl", "ghi", "ghk", "gkl", "hij", "hik",
+   "hkl", "ijk", "ikl", "jkl", "klm", "ab$", "ad$", "ag$",
+   "de$", "dg$", "dk$", "gh$", "gk$", "kl$", "$$$"}));
 }
 
 TEST(DexIndexTrigrams, QueryTrigrams) {
-  EXPECT_THAT(generateQueryTrigrams("X86"), trigramsAre({"x86"}));
+  EXPECT_THAT(generateQueryTrigrams("c"), trigramsAre({"c$$"}));
+  EXPECT_THAT(generateQueryTrigrams("cl"), trigramsAre({"cl$"}));
+  EXPECT_THAT(generateQueryTrigrams("cla"), trigramsAre({"cla"}));
 
-  EXPECT_THAT(generateQueryTrigrams("nl"), trigramsAre({}));
+  EXPECT_THAT(generateQueryTrigrams("__b"), trigramsAre({"__$"}));
+  EXPECT_THAT(generateQueryTrigrams("_"), trigramsAre({"_$$"}));
+
+  EXPECT_THAT(generateQueryTrigrams("X86"), trigramsAre({"x86"}));
 
   EXPECT_THAT(generateQueryTrigrams("clangd"),
   trigramsAre({"cla", "lan", "ang", "ngd"}));
Index: clang-tools-extra/clangd/index/dex/Trigram.h
===
--- clang-tools-extra/clangd/index/dex/Tr

[PATCH] D50337: [clangd] DexIndex implementation prototype

2018-08-10 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev updated this revision to Diff 160104.
kbobyrev marked 12 inline comments as done.
kbobyrev added a comment.

Address most comments.


https://reviews.llvm.org/D50337

Files:
  clang-tools-extra/clangd/CMakeLists.txt
  clang-tools-extra/clangd/index/dex/DexIndex.cpp
  clang-tools-extra/clangd/index/dex/DexIndex.h
  clang-tools-extra/clangd/index/dex/Token.h
  clang-tools-extra/unittests/clangd/CMakeLists.txt
  clang-tools-extra/unittests/clangd/DexIndexTests.cpp
  clang-tools-extra/unittests/clangd/IndexTests.cpp
  clang-tools-extra/unittests/clangd/TestIndexOperations.cpp
  clang-tools-extra/unittests/clangd/TestIndexOperations.h

Index: clang-tools-extra/unittests/clangd/TestIndexOperations.h
===
--- /dev/null
+++ clang-tools-extra/unittests/clangd/TestIndexOperations.h
@@ -0,0 +1,57 @@
+//===-- IndexHelpers.h --*- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+
+#ifndef LLVM_CLANG_TOOLS_EXTRA_UNITTESTS_CLANGD_INDEXTESTCOMMON_H
+#define LLVM_CLANG_TOOLS_EXTRA_UNITTESTS_CLANGD_INDEXTESTCOMMON_H
+
+#include "index/Index.h"
+#include "index/Merge.h"
+#include "index/dex/DexIndex.h"
+#include "index/dex/Iterator.h"
+#include "index/dex/Token.h"
+#include "index/dex/Trigram.h"
+
+namespace clang {
+namespace clangd {
+
+Symbol symbol(llvm::StringRef QName);
+
+struct SlabAndPointers {
+  SymbolSlab Slab;
+  std::vector Pointers;
+};
+
+// Create a slab of symbols with the given qualified names as both IDs and
+// names. The lifetime of the slab is managed by the returned shared pointer.
+// If \p WeakSymbols is provided, it will be pointed to the managed object in
+// the returned shared pointer.
+std::shared_ptr>
+generateSymbols(std::vector QualifiedNames,
+std::weak_ptr *WeakSymbols = nullptr);
+
+// Create a slab of symbols with IDs and names [Begin, End], otherwise identical
+// to the `generateSymbols` above.
+std::shared_ptr>
+generateNumSymbols(int Begin, int End,
+   std::weak_ptr *WeakSymbols = nullptr);
+
+std::string getQualifiedName(const Symbol &Sym);
+
+std::vector match(const SymbolIndex &I,
+   const FuzzyFindRequest &Req,
+   bool *Incomplete = nullptr);
+
+// Returns qualified names of symbols with any of IDs in the index.
+std::vector lookup(const SymbolIndex &I,
+llvm::ArrayRef IDs);
+
+} // namespace clangd
+} // namespace clang
+
+#endif
Index: clang-tools-extra/unittests/clangd/TestIndexOperations.cpp
===
--- /dev/null
+++ clang-tools-extra/unittests/clangd/TestIndexOperations.cpp
@@ -0,0 +1,89 @@
+//===-- IndexHelpers.cpp *- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+
+#include "TestIndexOperations.h"
+
+namespace clang {
+namespace clangd {
+
+Symbol symbol(llvm::StringRef QName) {
+  Symbol Sym;
+  Sym.ID = SymbolID(QName.str());
+  size_t Pos = QName.rfind("::");
+  if (Pos == llvm::StringRef::npos) {
+Sym.Name = QName;
+Sym.Scope = "";
+  } else {
+Sym.Name = QName.substr(Pos + 2);
+Sym.Scope = QName.substr(0, Pos + 2);
+  }
+  return Sym;
+}
+
+// Create a slab of symbols with the given qualified names as both IDs and
+// names. The lifetime of the slab is managed by the returned shared pointer.
+// If \p WeakSymbols is provided, it will be pointed to the managed object in
+// the returned shared pointer.
+std::shared_ptr>
+generateSymbols(std::vector QualifiedNames,
+std::weak_ptr *WeakSymbols) {
+  SymbolSlab::Builder Slab;
+  for (llvm::StringRef QName : QualifiedNames)
+Slab.insert(symbol(QName));
+
+  auto Storage = std::make_shared();
+  Storage->Slab = std::move(Slab).build();
+  for (const auto &Sym : Storage->Slab)
+Storage->Pointers.push_back(&Sym);
+  if (WeakSymbols)
+*WeakSymbols = Storage;
+  auto *Pointers = &Storage->Pointers;
+  return {std::move(Storage), Pointers};
+}
+
+// Create a slab of symbols with IDs and names [Begin, End], otherwise identical
+// to the `generateSymbols` above.
+std::shared_ptr>
+generateNumSymbols(int Begin, int End,
+   std::weak_ptr *WeakSymbols) {
+  std::vector Names;
+  for (int i = Begin; i <= End; i++)
+Names.push_back(std::to_string(i));
+  return generateSymbols(Names, WeakSymbols);
+}
+
+std::string getQualifiedName(const Symbol &Sym) {
+  return (Sym.Scope + Sym.Name)

[PATCH] D50517: [clangd] Generate incomplete trigrams for the Dex index

2018-08-10 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev updated this revision to Diff 160133.
kbobyrev marked 7 inline comments as done.
kbobyrev added a comment.

Address issues we have discussed with Eric.


https://reviews.llvm.org/D50517

Files:
  clang-tools-extra/clangd/index/dex/Iterator.h
  clang-tools-extra/clangd/index/dex/Trigram.cpp
  clang-tools-extra/clangd/index/dex/Trigram.h
  clang-tools-extra/unittests/clangd/DexIndexTests.cpp

Index: clang-tools-extra/unittests/clangd/DexIndexTests.cpp
===
--- clang-tools-extra/unittests/clangd/DexIndexTests.cpp
+++ clang-tools-extra/unittests/clangd/DexIndexTests.cpp
@@ -250,45 +250,57 @@
 }
 
 TEST(DexIndexTrigrams, IdentifierTrigrams) {
-  EXPECT_THAT(generateIdentifierTrigrams("X86"), trigramsAre({"x86"}));
+  EXPECT_THAT(generateIdentifierTrigrams("X86"),
+  trigramsAre({"x86", "x$$", "x8$", "$$$"}));
 
-  EXPECT_THAT(generateIdentifierTrigrams("nl"), trigramsAre({}));
+  EXPECT_THAT(generateIdentifierTrigrams("nl"),
+  trigramsAre({"nl$", "n$$", "$$$"}));
+
+  EXPECT_THAT(generateIdentifierTrigrams("n"), trigramsAre({"n$$", "$$$"}));
 
   EXPECT_THAT(generateIdentifierTrigrams("clangd"),
-  trigramsAre({"cla", "lan", "ang", "ngd"}));
+  trigramsAre({"c$$", "cl$", "cla", "lan", "ang", "ngd", "$$$"}));
 
   EXPECT_THAT(generateIdentifierTrigrams("abc_def"),
-  trigramsAre({"abc", "abd", "ade", "bcd", "bde", "cde", "def"}));
+  trigramsAre({"a$$", "abc", "abd", "ade", "bcd", "bde", "cde",
+   "def", "ab$", "ad$", "$$$"}));
 
-  EXPECT_THAT(
-  generateIdentifierTrigrams("a_b_c_d_e_"),
-  trigramsAre({"abc", "abd", "acd", "ace", "bcd", "bce", "bde", "cde"}));
+  EXPECT_THAT(generateIdentifierTrigrams("a_b_c_d_e_"),
+  trigramsAre({"a$$", "a_$", "a_b", "abc", "abd", "acd", "ace",
+   "bcd", "bce", "bde", "cde", "ab$", "$$$"}));
 
-  EXPECT_THAT(
-  generateIdentifierTrigrams("unique_ptr"),
-  trigramsAre({"uni", "unp", "upt", "niq", "nip", "npt", "iqu", "iqp",
-   "ipt", "que", "qup", "qpt", "uep", "ept", "ptr"}));
+  EXPECT_THAT(generateIdentifierTrigrams("unique_ptr"),
+  trigramsAre({"u$$", "uni", "unp", "upt", "niq", "nip", "npt",
+   "iqu", "iqp", "ipt", "que", "qup", "qpt", "uep",
+   "ept", "ptr", "un$", "up$", "$$$"}));
 
   EXPECT_THAT(generateIdentifierTrigrams("TUDecl"),
-  trigramsAre({"tud", "tde", "ude", "dec", "ecl"}));
+  trigramsAre({"t$$", "tud", "tde", "ude", "dec", "ecl", "tu$",
+   "td$", "$$$"}));
 
   EXPECT_THAT(generateIdentifierTrigrams("IsOK"),
-  trigramsAre({"iso", "iok", "sok"}));
-
-  EXPECT_THAT(generateIdentifierTrigrams("abc_defGhij__klm"),
-  trigramsAre({
-  "abc", "abd", "abg", "ade", "adg", "adk", "agh", "agk", "bcd",
-  "bcg", "bde", "bdg", "bdk", "bgh", "bgk", "cde", "cdg", "cdk",
-  "cgh", "cgk", "def", "deg", "dek", "dgh", "dgk", "dkl", "efg",
-  "efk", "egh", "egk", "ekl", "fgh", "fgk", "fkl", "ghi", "ghk",
-  "gkl", "hij", "hik", "hkl", "ijk", "ikl", "jkl", "klm",
-  }));
+  trigramsAre({"i$$", "iso", "iok", "sok", "is$", "io$", "$$$"}));
+
+  EXPECT_THAT(
+  generateIdentifierTrigrams("abc_defGhij__klm"),
+  trigramsAre({"a$$", "abc", "abd", "abg", "ade", "adg", "adk", "agh",
+   "agk", "bcd", "bcg", "bde", "bdg", "bdk", "bgh", "bgk",
+   "cde", "cdg", "cdk", "cgh", "cgk", "def", "deg", "dek",
+   "dgh", "dgk", "dkl", "efg", "efk", "egh", "egk", "ekl",
+   "fgh", "fgk", "fkl", "ghi", "ghk", "gkl", "hij", "hik",
+   "hkl", "ijk", "ikl", "jkl", "klm", "ab$", "ad$", "$$$"}));
 }
 
 TEST(DexIndexTrigrams, QueryTrigrams) {
-  EXPECT_THAT(generateQueryTrigrams("X86"), trigramsAre({"x86"}));
+  EXPECT_THAT(generateQueryTrigrams("c"), trigramsAre({"c$$"}));
+  EXPECT_THAT(generateQueryTrigrams("cl"), trigramsAre({"cl$"}));
+  EXPECT_THAT(generateQueryTrigrams("cla"), trigramsAre({"cla"}));
 
-  EXPECT_THAT(generateQueryTrigrams("nl"), trigramsAre({}));
+  EXPECT_THAT(generateQueryTrigrams("_"), trigramsAre({"_$$"}));
+  EXPECT_THAT(generateQueryTrigrams("__"), trigramsAre({"__$"}));
+  EXPECT_THAT(generateQueryTrigrams("___"), trigramsAre({"___"}));
+
+  EXPECT_THAT(generateQueryTrigrams("X86"), trigramsAre({"x86"}));
 
   EXPECT_THAT(generateQueryTrigrams("clangd"),
   trigramsAre({"cla", "lan", "ang", "ngd"}));
Index: clang-tools-extra/clangd/index/dex/Trigram.h
===
--- clang-tools-extra/clangd/index/dex/Trigram.h
+++ clang-tools-extra/clangd/index/dex/Trigram.h
@@ -36,14 +36,26 @@
 /// First, given Identifier (unqualified symbol name) is segmen

[PATCH] D50337: [clangd] DexIndex implementation prototype

2018-08-10 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev updated this revision to Diff 160146.
kbobyrev marked 2 inline comments as done.
kbobyrev added a comment.

Store symbol qualities so that they are not recomputed on every request, which
might be expensive. Use `operator[]` to construct the value in the inverted
index when the key has not been inserted yet.
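
The two changes described above can be illustrated with a minimal,
self-contained sketch. It is not the code from this revision: `ToySymbol`,
`ToyIndex`, `computeQuality` and `buildToyIndex` are hypothetical names, tokens
are plain strings, and `std::map` stands in for whatever container the patch
actually uses. It only shows the pattern: the quality is computed once per
symbol at build time, and `operator[]` default-constructs an empty posting list
the first time a token is seen.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Assumption: a DocID is simply the position of the symbol in the input.
using DocID = std::size_t;
using PostingList = std::vector<DocID>;

// Hypothetical symbol record used only for this illustration.
struct ToySymbol {
  std::string Name;
  std::vector<std::string> Tokens; // e.g. trigrams of Name
  unsigned References;             // input to the toy quality function
};

// Hypothetical quality function; imagine it being expensive to evaluate.
float computeQuality(const ToySymbol &Sym) {
  return static_cast<float>(Sym.References);
}

struct ToyIndex {
  std::map<std::string, PostingList> InvertedIndex;
  std::vector<float> Qualities; // indexed by DocID, filled once at build time
};

ToyIndex buildToyIndex(const std::vector<ToySymbol> &Symbols) {
  ToyIndex Index;
  Index.Qualities.reserve(Symbols.size());
  for (DocID ID = 0; ID < Symbols.size(); ++ID) {
    // Cache the quality so lookups do not have to recompute it.
    Index.Qualities.push_back(computeQuality(Symbols[ID]));
    for (const std::string &Tok : Symbols[ID].Tokens) {
      // operator[] inserts an empty PostingList if Tok is not a key yet.
      PostingList &List = Index.InvertedIndex[Tok];
      // IDs arrive in increasing order, so the list stays sorted; skip a
      // token that repeats within the same symbol.
      if (List.empty() || List.back() != ID)
        List.push_back(ID);
    }
  }
  return Index;
}
```

With `operator[]` there is no need for a separate "find, then insert if
missing" branch, and a query can read `Qualities[ID]` directly instead of
calling the quality function again.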


https://reviews.llvm.org/D50337

Files:
  clang-tools-extra/clangd/CMakeLists.txt
  clang-tools-extra/clangd/index/dex/DexIndex.cpp
  clang-tools-extra/clangd/index/dex/DexIndex.h
  clang-tools-extra/clangd/index/dex/Token.h
  clang-tools-extra/unittests/clangd/CMakeLists.txt
  clang-tools-extra/unittests/clangd/DexIndexTests.cpp
  clang-tools-extra/unittests/clangd/IndexTests.cpp
  clang-tools-extra/unittests/clangd/TestIndexOperations.cpp
  clang-tools-extra/unittests/clangd/TestIndexOperations.h

Index: clang-tools-extra/unittests/clangd/TestIndexOperations.h
===
--- /dev/null
+++ clang-tools-extra/unittests/clangd/TestIndexOperations.h
@@ -0,0 +1,57 @@
+//===-- IndexHelpers.h --*- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+
+#ifndef LLVM_CLANG_TOOLS_EXTRA_UNITTESTS_CLANGD_INDEXTESTCOMMON_H
+#define LLVM_CLANG_TOOLS_EXTRA_UNITTESTS_CLANGD_INDEXTESTCOMMON_H
+
+#include "index/Index.h"
+#include "index/Merge.h"
+#include "index/dex/DexIndex.h"
+#include "index/dex/Iterator.h"
+#include "index/dex/Token.h"
+#include "index/dex/Trigram.h"
+
+namespace clang {
+namespace clangd {
+
+Symbol symbol(llvm::StringRef QName);
+
+struct SlabAndPointers {
+  SymbolSlab Slab;
+  std::vector Pointers;
+};
+
+// Create a slab of symbols with the given qualified names as both IDs and
+// names. The lifetime of the slab is managed by the returned shared pointer.
+// If \p WeakSymbols is provided, it will be pointed to the managed object in
+// the returned shared pointer.
+std::shared_ptr>
+generateSymbols(std::vector QualifiedNames,
+std::weak_ptr *WeakSymbols = nullptr);
+
+// Create a slab of symbols with IDs and names [Begin, End], otherwise identical
+// to the `generateSymbols` above.
+std::shared_ptr>
+generateNumSymbols(int Begin, int End,
+   std::weak_ptr *WeakSymbols = nullptr);
+
+std::string getQualifiedName(const Symbol &Sym);
+
+std::vector match(const SymbolIndex &I,
+   const FuzzyFindRequest &Req,
+   bool *Incomplete = nullptr);
+
+// Returns qualified names of symbols with any of IDs in the index.
+std::vector lookup(const SymbolIndex &I,
+llvm::ArrayRef IDs);
+
+} // namespace clangd
+} // namespace clang
+
+#endif
Index: clang-tools-extra/unittests/clangd/TestIndexOperations.cpp
===
--- /dev/null
+++ clang-tools-extra/unittests/clangd/TestIndexOperations.cpp
@@ -0,0 +1,89 @@
+//===-- IndexHelpers.cpp *- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+
+#include "TestIndexOperations.h"
+
+namespace clang {
+namespace clangd {
+
+Symbol symbol(llvm::StringRef QName) {
+  Symbol Sym;
+  Sym.ID = SymbolID(QName.str());
+  size_t Pos = QName.rfind("::");
+  if (Pos == llvm::StringRef::npos) {
+Sym.Name = QName;
+Sym.Scope = "";
+  } else {
+Sym.Name = QName.substr(Pos + 2);
+Sym.Scope = QName.substr(0, Pos + 2);
+  }
+  return Sym;
+}
+
+// Create a slab of symbols with the given qualified names as both IDs and
+// names. The lifetime of the slab is managed by the returned shared pointer.
+// If \p WeakSymbols is provided, it will be pointed to the managed object in
+// the returned shared pointer.
+std::shared_ptr>
+generateSymbols(std::vector QualifiedNames,
+std::weak_ptr *WeakSymbols) {
+  SymbolSlab::Builder Slab;
+  for (llvm::StringRef QName : QualifiedNames)
+Slab.insert(symbol(QName));
+
+  auto Storage = std::make_shared();
+  Storage->Slab = std::move(Slab).build();
+  for (const auto &Sym : Storage->Slab)
+Storage->Pointers.push_back(&Sym);
+  if (WeakSymbols)
+*WeakSymbols = Storage;
+  auto *Pointers = &Storage->Pointers;
+  return {std::move(Storage), Pointers};
+}
+
+// Create a slab of symbols with IDs and names [Begin, End], otherwise identical
+// to the `generateSymbols` above.
+std::shared_ptr>
+generateNumSymbols(int Begin, int End,
+   std::weak_ptr *WeakSymbols) {
+  std::vector Names;
+  for (int i = Begin; i <= End; i++)
+Na

[PATCH] D50517: [clangd] Generate incomplete trigrams for the Dex index

2018-08-10 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev updated this revision to Diff 160154.
kbobyrev marked 7 inline comments as done.
kbobyrev added a comment.

Address a round of comments.


https://reviews.llvm.org/D50517

Files:
  clang-tools-extra/clangd/index/dex/Iterator.h
  clang-tools-extra/clangd/index/dex/Trigram.cpp
  clang-tools-extra/clangd/index/dex/Trigram.h
  clang-tools-extra/unittests/clangd/DexIndexTests.cpp

Index: clang-tools-extra/unittests/clangd/DexIndexTests.cpp
===
--- clang-tools-extra/unittests/clangd/DexIndexTests.cpp
+++ clang-tools-extra/unittests/clangd/DexIndexTests.cpp
@@ -250,45 +250,57 @@
 }
 
 TEST(DexIndexTrigrams, IdentifierTrigrams) {
-  EXPECT_THAT(generateIdentifierTrigrams("X86"), trigramsAre({"x86"}));
+  EXPECT_THAT(generateIdentifierTrigrams("X86"),
+  trigramsAre({"x86", "x$$", "x8$", "$$$"}));
 
-  EXPECT_THAT(generateIdentifierTrigrams("nl"), trigramsAre({}));
+  EXPECT_THAT(generateIdentifierTrigrams("nl"),
+  trigramsAre({"nl$", "n$$", "$$$"}));
+
+  EXPECT_THAT(generateIdentifierTrigrams("n"), trigramsAre({"n$$", "$$$"}));
 
   EXPECT_THAT(generateIdentifierTrigrams("clangd"),
-  trigramsAre({"cla", "lan", "ang", "ngd"}));
+  trigramsAre({"c$$", "cl$", "cla", "lan", "ang", "ngd", "$$$"}));
 
   EXPECT_THAT(generateIdentifierTrigrams("abc_def"),
-  trigramsAre({"abc", "abd", "ade", "bcd", "bde", "cde", "def"}));
+  trigramsAre({"a$$", "abc", "abd", "ade", "bcd", "bde", "cde",
+   "def", "ab$", "ad$", "$$$"}));
 
-  EXPECT_THAT(
-  generateIdentifierTrigrams("a_b_c_d_e_"),
-  trigramsAre({"abc", "abd", "acd", "ace", "bcd", "bce", "bde", "cde"}));
+  EXPECT_THAT(generateIdentifierTrigrams("a_b_c_d_e_"),
+  trigramsAre({"a$$", "a_$", "a_b", "abc", "abd", "acd", "ace",
+   "bcd", "bce", "bde", "cde", "ab$", "$$$"}));
 
-  EXPECT_THAT(
-  generateIdentifierTrigrams("unique_ptr"),
-  trigramsAre({"uni", "unp", "upt", "niq", "nip", "npt", "iqu", "iqp",
-   "ipt", "que", "qup", "qpt", "uep", "ept", "ptr"}));
+  EXPECT_THAT(generateIdentifierTrigrams("unique_ptr"),
+  trigramsAre({"u$$", "uni", "unp", "upt", "niq", "nip", "npt",
+   "iqu", "iqp", "ipt", "que", "qup", "qpt", "uep",
+   "ept", "ptr", "un$", "up$", "$$$"}));
 
   EXPECT_THAT(generateIdentifierTrigrams("TUDecl"),
-  trigramsAre({"tud", "tde", "ude", "dec", "ecl"}));
+  trigramsAre({"t$$", "tud", "tde", "ude", "dec", "ecl", "tu$",
+   "td$", "$$$"}));
 
   EXPECT_THAT(generateIdentifierTrigrams("IsOK"),
-  trigramsAre({"iso", "iok", "sok"}));
-
-  EXPECT_THAT(generateIdentifierTrigrams("abc_defGhij__klm"),
-  trigramsAre({
-  "abc", "abd", "abg", "ade", "adg", "adk", "agh", "agk", "bcd",
-  "bcg", "bde", "bdg", "bdk", "bgh", "bgk", "cde", "cdg", "cdk",
-  "cgh", "cgk", "def", "deg", "dek", "dgh", "dgk", "dkl", "efg",
-  "efk", "egh", "egk", "ekl", "fgh", "fgk", "fkl", "ghi", "ghk",
-  "gkl", "hij", "hik", "hkl", "ijk", "ikl", "jkl", "klm",
-  }));
+  trigramsAre({"i$$", "iso", "iok", "sok", "is$", "io$", "$$$"}));
+
+  EXPECT_THAT(
+  generateIdentifierTrigrams("abc_defGhij__klm"),
+  trigramsAre({"a$$", "abc", "abd", "abg", "ade", "adg", "adk", "agh",
+   "agk", "bcd", "bcg", "bde", "bdg", "bdk", "bgh", "bgk",
+   "cde", "cdg", "cdk", "cgh", "cgk", "def", "deg", "dek",
+   "dgh", "dgk", "dkl", "efg", "efk", "egh", "egk", "ekl",
+   "fgh", "fgk", "fkl", "ghi", "ghk", "gkl", "hij", "hik",
+   "hkl", "ijk", "ikl", "jkl", "klm", "ab$", "ad$", "$$$"}));
 }
 
 TEST(DexIndexTrigrams, QueryTrigrams) {
-  EXPECT_THAT(generateQueryTrigrams("X86"), trigramsAre({"x86"}));
+  EXPECT_THAT(generateQueryTrigrams("c"), trigramsAre({"c$$"}));
+  EXPECT_THAT(generateQueryTrigrams("cl"), trigramsAre({"cl$"}));
+  EXPECT_THAT(generateQueryTrigrams("cla"), trigramsAre({"cla"}));
 
-  EXPECT_THAT(generateQueryTrigrams("nl"), trigramsAre({}));
+  EXPECT_THAT(generateQueryTrigrams("_"), trigramsAre({"_$$"}));
+  EXPECT_THAT(generateQueryTrigrams("__"), trigramsAre({"__$"}));
+  EXPECT_THAT(generateQueryTrigrams("___"), trigramsAre({"___"}));
+
+  EXPECT_THAT(generateQueryTrigrams("X86"), trigramsAre({"x86"}));
 
   EXPECT_THAT(generateQueryTrigrams("clangd"),
   trigramsAre({"cla", "lan", "ang", "ngd"}));
Index: clang-tools-extra/clangd/index/dex/Trigram.h
===
--- clang-tools-extra/clangd/index/dex/Trigram.h
+++ clang-tools-extra/clangd/index/dex/Trigram.h
@@ -36,14 +36,20 @@
 /// First, given Identifier (unqualified symbol name) is segmented using
 /// 

[PATCH] D50576: [clangd] Allow consumption of DocIDs without overhead

2018-08-10 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev created this revision.
kbobyrev added reviewers: ioeric, ilya-biryukov.
Herald added subscribers: arphaman, jkorous, MaskRay.

This patch allows processing DocIDs from an iterator through a callback, so
that they are not stored in a vector when the DocIDs themselves are not needed.

This overhead is present in the https://reviews.llvm.org/D50337 patch:
`fuzzyFindLongQuery` stores `SymbolDocIDs`, but they are thrown away later
because what the index really needs is `std::vector> Scores;`, which can be
filled on the fly in `matchSymbols`.
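
As a rough, standalone illustration of the idea (not the code from this
patch), the sketch below uses a hypothetical `ListIterator` over a sorted
vector, a hypothetical `forEachDoc` helper, and `std::function` in place of
`llvm::function_ref` so that it compiles without LLVM headers. `DocID` is
assumed to be an unsigned integer. The vector-returning helper becomes a thin
wrapper over the callback version, and callers that only need a derived value
never build the intermediate vector.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <limits>
#include <vector>

using DocID = unsigned; // assumption made for this sketch

// Hypothetical stand-in for a Dex iterator: walks a sorted posting list.
class ListIterator {
public:
  explicit ListIterator(const std::vector<DocID> &Docs) : Docs(Docs) {}
  bool reachedEnd() const { return Index == Docs.size(); }
  void advance() { ++Index; }
  DocID peek() const {
    assert(!reachedEnd() && "peek() called on an exhausted iterator");
    return Docs[Index];
  }

private:
  const std::vector<DocID> &Docs;
  std::size_t Index = 0;
};

// Calls Callback on each DocID until the iterator is exhausted or Limit
// items have been processed. Nothing is stored here.
void forEachDoc(ListIterator &It, const std::function<void(DocID)> &Callback,
                std::size_t Limit = std::numeric_limits<std::size_t>::max()) {
  for (std::size_t Retrieved = 0; !It.reachedEnd() && Retrieved < Limit;
       It.advance(), ++Retrieved)
    Callback(It.peek());
}

// Collecting into a vector is just one possible callback.
std::vector<DocID>
collect(ListIterator &It,
        std::size_t Limit = std::numeric_limits<std::size_t>::max()) {
  std::vector<DocID> Result;
  forEachDoc(It, [&](DocID ID) { Result.push_back(ID); }, Limit);
  return Result;
}
```

A caller that only wants scores could pass a lambda that looks up the symbol
for each `DocID` and appends a score, which avoids materializing the DocIDs
first.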


https://reviews.llvm.org/D50576

Files:
  clang-tools-extra/clangd/index/dex/Iterator.cpp
  clang-tools-extra/clangd/index/dex/Iterator.h


Index: clang-tools-extra/clangd/index/dex/Iterator.h
===
--- clang-tools-extra/clangd/index/dex/Iterator.h
+++ clang-tools-extra/clangd/index/dex/Iterator.h
@@ -101,8 +101,14 @@
   virtual llvm::raw_ostream &dump(llvm::raw_ostream &OS) const = 0;
 };
 
-/// Advances the iterator until it is either exhausted or the number of
-/// requested items is reached. The result contains sorted DocumentIDs.
+/// Advances given iterator until it is exhausted or the requested number of
+/// symbols is already seen while using callback on each processed DocID.
+void matchSymbols(Iterator &It,
+  llvm::function_ref Callback,
+  size_t Limit = std::numeric_limits::max());
+
+/// Advances given iterator until it is either exhausted or the number of
+/// requested items is reached and returns all seen DocIDs.
 std::vector consume(Iterator &It,
size_t Limit = std::numeric_limits::max());
 
Index: clang-tools-extra/clangd/index/dex/Iterator.cpp
===
--- clang-tools-extra/clangd/index/dex/Iterator.cpp
+++ clang-tools-extra/clangd/index/dex/Iterator.cpp
@@ -220,10 +220,16 @@
 
 std::vector consume(Iterator &It, size_t Limit) {
   std::vector Result;
+  matchSymbols(It, [&](const DocID &ID) { Result.push_back(ID); }, Limit);
+  return Result;
+}
+
+void matchSymbols(Iterator &It,
+  llvm::function_ref Callback,
+  size_t Limit) {
   for (size_t Retrieved = 0; !It.reachedEnd() && Retrieved < Limit;
It.advance(), ++Retrieved)
-Result.push_back(It.peek());
-  return Result;
+Callback(It.peek());
 }
 
 std::unique_ptr create(PostingListRef Documents) {


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D50517: [clangd] Generate incomplete trigrams for the Dex index

2018-08-13 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev updated this revision to Diff 160302.
kbobyrev marked an inline comment as done.
kbobyrev edited the summary of this revision.
kbobyrev added a comment.

Address the post-LGTM comment.


https://reviews.llvm.org/D50517

Files:
  clang-tools-extra/clangd/index/dex/Iterator.h
  clang-tools-extra/clangd/index/dex/Trigram.cpp
  clang-tools-extra/clangd/index/dex/Trigram.h
  clang-tools-extra/unittests/clangd/DexIndexTests.cpp

Index: clang-tools-extra/unittests/clangd/DexIndexTests.cpp
===
--- clang-tools-extra/unittests/clangd/DexIndexTests.cpp
+++ clang-tools-extra/unittests/clangd/DexIndexTests.cpp
@@ -250,45 +250,57 @@
 }
 
 TEST(DexIndexTrigrams, IdentifierTrigrams) {
-  EXPECT_THAT(generateIdentifierTrigrams("X86"), trigramsAre({"x86"}));
+  EXPECT_THAT(generateIdentifierTrigrams("X86"),
+  trigramsAre({"x86", "x$$", "x8$", "$$$"}));
 
-  EXPECT_THAT(generateIdentifierTrigrams("nl"), trigramsAre({}));
+  EXPECT_THAT(generateIdentifierTrigrams("nl"),
+  trigramsAre({"nl$", "n$$", "$$$"}));
+
+  EXPECT_THAT(generateIdentifierTrigrams("n"), trigramsAre({"n$$", "$$$"}));
 
   EXPECT_THAT(generateIdentifierTrigrams("clangd"),
-  trigramsAre({"cla", "lan", "ang", "ngd"}));
+  trigramsAre({"c$$", "cl$", "cla", "lan", "ang", "ngd", "$$$"}));
 
   EXPECT_THAT(generateIdentifierTrigrams("abc_def"),
-  trigramsAre({"abc", "abd", "ade", "bcd", "bde", "cde", "def"}));
+  trigramsAre({"a$$", "abc", "abd", "ade", "bcd", "bde", "cde",
+   "def", "ab$", "ad$", "$$$"}));
 
-  EXPECT_THAT(
-  generateIdentifierTrigrams("a_b_c_d_e_"),
-  trigramsAre({"abc", "abd", "acd", "ace", "bcd", "bce", "bde", "cde"}));
+  EXPECT_THAT(generateIdentifierTrigrams("a_b_c_d_e_"),
+  trigramsAre({"a$$", "a_$", "a_b", "abc", "abd", "acd", "ace",
+   "bcd", "bce", "bde", "cde", "ab$", "$$$"}));
 
-  EXPECT_THAT(
-  generateIdentifierTrigrams("unique_ptr"),
-  trigramsAre({"uni", "unp", "upt", "niq", "nip", "npt", "iqu", "iqp",
-   "ipt", "que", "qup", "qpt", "uep", "ept", "ptr"}));
+  EXPECT_THAT(generateIdentifierTrigrams("unique_ptr"),
+  trigramsAre({"u$$", "uni", "unp", "upt", "niq", "nip", "npt",
+   "iqu", "iqp", "ipt", "que", "qup", "qpt", "uep",
+   "ept", "ptr", "un$", "up$", "$$$"}));
 
   EXPECT_THAT(generateIdentifierTrigrams("TUDecl"),
-  trigramsAre({"tud", "tde", "ude", "dec", "ecl"}));
+  trigramsAre({"t$$", "tud", "tde", "ude", "dec", "ecl", "tu$",
+   "td$", "$$$"}));
 
   EXPECT_THAT(generateIdentifierTrigrams("IsOK"),
-  trigramsAre({"iso", "iok", "sok"}));
-
-  EXPECT_THAT(generateIdentifierTrigrams("abc_defGhij__klm"),
-  trigramsAre({
-  "abc", "abd", "abg", "ade", "adg", "adk", "agh", "agk", "bcd",
-  "bcg", "bde", "bdg", "bdk", "bgh", "bgk", "cde", "cdg", "cdk",
-  "cgh", "cgk", "def", "deg", "dek", "dgh", "dgk", "dkl", "efg",
-  "efk", "egh", "egk", "ekl", "fgh", "fgk", "fkl", "ghi", "ghk",
-  "gkl", "hij", "hik", "hkl", "ijk", "ikl", "jkl", "klm",
-  }));
+  trigramsAre({"i$$", "iso", "iok", "sok", "is$", "io$", "$$$"}));
+
+  EXPECT_THAT(
+  generateIdentifierTrigrams("abc_defGhij__klm"),
+  trigramsAre({"a$$", "abc", "abd", "abg", "ade", "adg", "adk", "agh",
+   "agk", "bcd", "bcg", "bde", "bdg", "bdk", "bgh", "bgk",
+   "cde", "cdg", "cdk", "cgh", "cgk", "def", "deg", "dek",
+   "dgh", "dgk", "dkl", "efg", "efk", "egh", "egk", "ekl",
+   "fgh", "fgk", "fkl", "ghi", "ghk", "gkl", "hij", "hik",
+   "hkl", "ijk", "ikl", "jkl", "klm", "ab$", "ad$", "$$$"}));
 }
 
 TEST(DexIndexTrigrams, QueryTrigrams) {
-  EXPECT_THAT(generateQueryTrigrams("X86"), trigramsAre({"x86"}));
+  EXPECT_THAT(generateQueryTrigrams("c"), trigramsAre({"c$$"}));
+  EXPECT_THAT(generateQueryTrigrams("cl"), trigramsAre({"cl$"}));
+  EXPECT_THAT(generateQueryTrigrams("cla"), trigramsAre({"cla"}));
 
-  EXPECT_THAT(generateQueryTrigrams("nl"), trigramsAre({}));
+  EXPECT_THAT(generateQueryTrigrams("_"), trigramsAre({"_$$"}));
+  EXPECT_THAT(generateQueryTrigrams("__"), trigramsAre({"__$"}));
+  EXPECT_THAT(generateQueryTrigrams("___"), trigramsAre({"___"}));
+
+  EXPECT_THAT(generateQueryTrigrams("X86"), trigramsAre({"x86"}));
 
   EXPECT_THAT(generateQueryTrigrams("clangd"),
   trigramsAre({"cla", "lan", "ang", "ngd"}));
Index: clang-tools-extra/clangd/index/dex/Trigram.h
===
--- clang-tools-extra/clangd/index/dex/Trigram.h
+++ clang-tools-extra/clangd/index/dex/Trigram.h
@@ -36,14 +36,20 @@
 /// First, given Identifier (u

[PATCH] D50517: [clangd] Generate incomplete trigrams for the Dex index

2018-08-13 Thread Kirill Bobyrev via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL339548: [clangd] Generate incomplete trigrams for the Dex 
index (authored by omtcyfz, committed by ).
Herald added a subscriber: llvm-commits.

Changed prior to commit:
  https://reviews.llvm.org/D50517?vs=160302&id=160310#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D50517

Files:
  clang-tools-extra/trunk/clangd/index/dex/Iterator.h
  clang-tools-extra/trunk/clangd/index/dex/Trigram.cpp
  clang-tools-extra/trunk/clangd/index/dex/Trigram.h
  clang-tools-extra/trunk/unittests/clangd/DexIndexTests.cpp

Index: clang-tools-extra/trunk/unittests/clangd/DexIndexTests.cpp
===
--- clang-tools-extra/trunk/unittests/clangd/DexIndexTests.cpp
+++ clang-tools-extra/trunk/unittests/clangd/DexIndexTests.cpp
@@ -271,45 +271,57 @@
 }
 
 TEST(DexIndexTrigrams, IdentifierTrigrams) {
-  EXPECT_THAT(generateIdentifierTrigrams("X86"), trigramsAre({"x86"}));
+  EXPECT_THAT(generateIdentifierTrigrams("X86"),
+  trigramsAre({"x86", "x$$", "x8$", "$$$"}));
 
-  EXPECT_THAT(generateIdentifierTrigrams("nl"), trigramsAre({}));
+  EXPECT_THAT(generateIdentifierTrigrams("nl"),
+  trigramsAre({"nl$", "n$$", "$$$"}));
+
+  EXPECT_THAT(generateIdentifierTrigrams("n"), trigramsAre({"n$$", "$$$"}));
 
   EXPECT_THAT(generateIdentifierTrigrams("clangd"),
-  trigramsAre({"cla", "lan", "ang", "ngd"}));
+  trigramsAre({"c$$", "cl$", "cla", "lan", "ang", "ngd", "$$$"}));
 
   EXPECT_THAT(generateIdentifierTrigrams("abc_def"),
-  trigramsAre({"abc", "abd", "ade", "bcd", "bde", "cde", "def"}));
-
-  EXPECT_THAT(
-  generateIdentifierTrigrams("a_b_c_d_e_"),
-  trigramsAre({"abc", "abd", "acd", "ace", "bcd", "bce", "bde", "cde"}));
+  trigramsAre({"a$$", "abc", "abd", "ade", "bcd", "bde", "cde",
+   "def", "ab$", "ad$", "$$$"}));
 
-  EXPECT_THAT(
-  generateIdentifierTrigrams("unique_ptr"),
-  trigramsAre({"uni", "unp", "upt", "niq", "nip", "npt", "iqu", "iqp",
-   "ipt", "que", "qup", "qpt", "uep", "ept", "ptr"}));
+  EXPECT_THAT(generateIdentifierTrigrams("a_b_c_d_e_"),
+  trigramsAre({"a$$", "a_$", "a_b", "abc", "abd", "acd", "ace",
+   "bcd", "bce", "bde", "cde", "ab$", "$$$"}));
+
+  EXPECT_THAT(generateIdentifierTrigrams("unique_ptr"),
+  trigramsAre({"u$$", "uni", "unp", "upt", "niq", "nip", "npt",
+   "iqu", "iqp", "ipt", "que", "qup", "qpt", "uep",
+   "ept", "ptr", "un$", "up$", "$$$"}));
 
   EXPECT_THAT(generateIdentifierTrigrams("TUDecl"),
-  trigramsAre({"tud", "tde", "ude", "dec", "ecl"}));
+  trigramsAre({"t$$", "tud", "tde", "ude", "dec", "ecl", "tu$",
+   "td$", "$$$"}));
 
   EXPECT_THAT(generateIdentifierTrigrams("IsOK"),
-  trigramsAre({"iso", "iok", "sok"}));
+  trigramsAre({"i$$", "iso", "iok", "sok", "is$", "io$", "$$$"}));
 
-  EXPECT_THAT(generateIdentifierTrigrams("abc_defGhij__klm"),
-  trigramsAre({
-  "abc", "abd", "abg", "ade", "adg", "adk", "agh", "agk", "bcd",
-  "bcg", "bde", "bdg", "bdk", "bgh", "bgk", "cde", "cdg", "cdk",
-  "cgh", "cgk", "def", "deg", "dek", "dgh", "dgk", "dkl", "efg",
-  "efk", "egh", "egk", "ekl", "fgh", "fgk", "fkl", "ghi", "ghk",
-  "gkl", "hij", "hik", "hkl", "ijk", "ikl", "jkl", "klm",
-  }));
+  EXPECT_THAT(
+  generateIdentifierTrigrams("abc_defGhij__klm"),
+  trigramsAre({"a$$", "abc", "abd", "abg", "ade", "adg", "adk", "agh",
+   "agk", "bcd", "bcg", "bde", "bdg", "bdk", "bgh", "bgk",
+   "cde", "cdg", "cdk", "cgh", "cgk", "def", "deg", "dek",
+   "dgh", "dgk", "dkl", "efg", "efk", "egh", "egk", "ekl",
+   "fgh", "fgk", "fkl", "ghi", "ghk", "gkl", "hij", "hik",
+   "hkl", "ijk", "ikl", "jkl", "klm", "ab$", "ad$", "$$$"}));
 }
 
 TEST(DexIndexTrigrams, QueryTrigrams) {
-  EXPECT_THAT(generateQueryTrigrams("X86"), trigramsAre({"x86"}));
+  EXPECT_THAT(generateQueryTrigrams("c"), trigramsAre({"c$$"}));
+  EXPECT_THAT(generateQueryTrigrams("cl"), trigramsAre({"cl$"}));
+  EXPECT_THAT(generateQueryTrigrams("cla"), trigramsAre({"cla"}));
+
+  EXPECT_THAT(generateQueryTrigrams("_"), trigramsAre({"_$$"}));
+  EXPECT_THAT(generateQueryTrigrams("__"), trigramsAre({"__$"}));
+  EXPECT_THAT(generateQueryTrigrams("___"), trigramsAre({"___"}));
 
-  EXPECT_THAT(generateQueryTrigrams("nl"), trigramsAre({}));
+  EXPECT_THAT(generateQueryTrigrams("X86"), trigramsAre({"x86"}));
 
   EXPECT_THAT(generateQueryTrigrams("clangd"),
   trigramsAre({"cla", "lan", "ang", "ngd"}));
Index: clang-tools-extra/trunk/clangd/index/dex/Iterator.h
===

[PATCH] D50337: [clangd] DexIndex implementation prototype

2018-08-13 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev planned changes to this revision.
kbobyrev added a comment.

As discussed offline, I should update the patch to reflect changes accepted in 
https://reviews.llvm.org/D50517.


https://reviews.llvm.org/D50337





[PATCH] D50689: [clangd] NFC: Improve Dex Iterators debugging traits

2018-08-14 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev created this revision.
kbobyrev added reviewers: ioeric, ilya-biryukov.
kbobyrev added a project: clang-tools-extra.
Herald added subscribers: arphaman, jkorous, MaskRay.

https://reviews.llvm.org/D50689

Files:
  clang-tools-extra/clangd/index/dex/Iterator.cpp
  clang-tools-extra/clangd/index/dex/Iterator.h
  clang-tools-extra/unittests/clangd/DexIndexTests.cpp


Index: clang-tools-extra/unittests/clangd/DexIndexTests.cpp
===
--- clang-tools-extra/unittests/clangd/DexIndexTests.cpp
+++ clang-tools-extra/unittests/clangd/DexIndexTests.cpp
@@ -231,13 +231,14 @@
   const PostingList L4 = {0, 1, 5};
   const PostingList L5;
 
-  EXPECT_EQ(llvm::to_string(*(create(L0))), "[4, 7, 8, 20, 42, 100]");
+  EXPECT_EQ(llvm::to_string(*(create(L0))), "[{4}, 7, 8, 20, 42, 100, END]");
 
   auto Nested = createAnd(createAnd(create(L1), create(L2)),
   createOr(create(L3), create(L4), create(L5)));
 
   EXPECT_EQ(llvm::to_string(*Nested),
-"(& (& [1, 3, 5, 8, 9] [1, 5, 7, 9]) (| [0, 5] [0, 1, 5] []))");
+"(& (& [{1}, 3, 5, 8, 9, END] [{1}, 5, 7, 9, END]) (| [0, {5}, "
+"END] [0, {1}, 5, END] [{END}]))");
 }
 
 TEST(DexIndexIterators, Limit) {
Index: clang-tools-extra/clangd/index/dex/Iterator.h
===
--- clang-tools-extra/clangd/index/dex/Iterator.h
+++ clang-tools-extra/clangd/index/dex/Iterator.h
@@ -99,7 +99,9 @@
   ///
   /// Where Type is the iterator type representation: "&" for And, "|" for Or,
   /// ChildN is N-th iterator child. Raw iterators over PostingList are
-  /// represented as "[ID1, ID2, ...]" where IDN is N-th PostingList entry.
+  /// represented as "[ID1, ID2, ..., {IDX}, ... END]" where IDN is N-th
+  /// PostingList entry and IDX is the one currently being pointed to by the
+  /// corresponding iterator.
   friend llvm::raw_ostream &operator<<(llvm::raw_ostream &OS,
const Iterator &Iterator) {
 return Iterator.dump(OS);
Index: clang-tools-extra/clangd/index/dex/Iterator.cpp
===
--- clang-tools-extra/clangd/index/dex/Iterator.cpp
+++ clang-tools-extra/clangd/index/dex/Iterator.cpp
@@ -49,10 +49,19 @@
   llvm::raw_ostream &dump(llvm::raw_ostream &OS) const override {
 OS << '[';
 auto Separator = "";
-for (const auto &ID : Documents) {
-  OS << Separator << ID;
+for (auto It = std::begin(Documents); It != std::end(Documents); ++It) {
+  OS << Separator;
+  if (It == Index)
+OS << '{' << *It << '}';
+  else
+OS << *It;
   Separator = ", ";
 }
+OS << Separator;
+if (Index == std::end(Documents))
+  OS << "{END}";
+else
+  OS << "END";
 OS << ']';
 return OS;
   }
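
The debug format introduced by the diff above can be reproduced outside of clangd. The following is a minimal sketch under stated assumptions: `dumpPostingList` is a hypothetical free function, the posting list is a plain `std::vector`, and the current position is passed as an index rather than stored in an iterator object.

```cpp
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper, not part of clangd: renders a posting list in the
// "[ID1, {IDX}, ..., END]" debug format, where Index is the position the
// iterator currently points to (Documents.size() means "at the end").
std::string dumpPostingList(const std::vector<std::uint32_t> &Documents,
                            std::size_t Index) {
  std::ostringstream OS;
  OS << '[';
  const char *Separator = "";
  for (std::size_t I = 0; I < Documents.size(); ++I) {
    OS << Separator;
    if (I == Index)
      OS << '{' << Documents[I] << '}'; // Mark the current element.
    else
      OS << Documents[I];
    Separator = ", ";
  }
  OS << Separator;
  // The END sentinel is wrapped in braces when the iterator is exhausted.
  OS << (Index == Documents.size() ? "{END}" : "END");
  OS << ']';
  return OS.str();
}
```

Printing the sentinel explicitly makes an exhausted iterator distinguishable from one that is still pointing at an element, which the old format could not show.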



[PATCH] D50700: [clangd] Generate better incomplete bigrams for the Dex index

2018-08-14 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev created this revision.
kbobyrev added reviewers: ioeric, ilya-biryukov.
kbobyrev added a project: clang-tools-extra.
Herald added subscribers: arphaman, jkorous, MaskRay.

Currently, the query trigram generator simply yields the `u_p` trigram for the
`u_p` query. This is not optimal: the user is most likely trying to match two
heads, as in `unique_ptr`. With this patch, such a query yields the incomplete
bigram `up$` instead.
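
The rule described above can be sketched in a few lines. This is a simplified illustration, not the clangd implementation: `segment` is a hypothetical replacement for the fuzzy-matching segmentation (it only treats non-alphanumeric characters as separators and recognizes lowercase-to-uppercase humps), and `shortQueryTrigram` assumes the query has fewer than three Head or Tail characters.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

enum class Role { Head, Tail, Separator };

static bool isAlnum(char C) {
  return std::isalnum(static_cast<unsigned char>(C)) != 0;
}

// Simplified, hypothetical segmentation: a character is a Head if it starts
// the query, follows a separator, or is an uppercase letter after a lowercase
// one; other alphanumeric characters are Tails.
std::vector<Role> segment(const std::string &Query) {
  std::vector<Role> Roles;
  for (std::size_t I = 0; I < Query.size(); ++I) {
    if (!isAlnum(Query[I])) {
      Roles.push_back(Role::Separator);
      continue;
    }
    bool AfterSeparator = I == 0 || !isAlnum(Query[I - 1]);
    bool CamelHump = I > 0 &&
                     std::isupper(static_cast<unsigned char>(Query[I])) &&
                     std::islower(static_cast<unsigned char>(Query[I - 1]));
    Roles.push_back(AfterSeparator || CamelHump ? Role::Head : Role::Tail);
  }
  return Roles;
}

// Builds the single incomplete trigram for a short query. '$' plays the role
// of the end marker used in the tests.
std::string shortQueryTrigram(const std::string &Query) {
  const char EndMarker = '$';
  const std::vector<Role> Roles = segment(Query);

  std::string Lower;
  for (char C : Query)
    Lower += static_cast<char>(std::tolower(static_cast<unsigned char>(C)));

  unsigned Heads = 0;
  for (Role R : Roles)
    if (R == Role::Head)
      ++Heads;

  std::string Chars;
  if (Heads == 2) {
    // Two heads: emit "xy$" where x and y are the head characters.
    for (std::size_t I = 0; I < Lower.size(); ++I)
      if (Roles[I] == Role::Head)
        Chars += Lower[I];
    Chars += EndMarker;
  } else {
    // Otherwise fall back to a prefix padded with end markers.
    Chars = Lower.substr(0, 3);
    Chars.append(3 - Chars.size(), EndMarker);
  }
  return Chars;
}
```

Under these assumptions `shortQueryTrigram("u_p")` produces `up$`, while single-head queries such as `cl` keep the padded-prefix form.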


https://reviews.llvm.org/D50700

Files:
  clang-tools-extra/clangd/index/dex/Trigram.cpp
  clang-tools-extra/clangd/index/dex/Trigram.h
  clang-tools-extra/unittests/clangd/DexIndexTests.cpp


Index: clang-tools-extra/unittests/clangd/DexIndexTests.cpp
===
--- clang-tools-extra/unittests/clangd/DexIndexTests.cpp
+++ clang-tools-extra/unittests/clangd/DexIndexTests.cpp
@@ -300,6 +300,8 @@
   EXPECT_THAT(generateQueryTrigrams("__"), trigramsAre({"__$"}));
   EXPECT_THAT(generateQueryTrigrams("___"), trigramsAre({"___"}));
 
+  EXPECT_THAT(generateQueryTrigrams("u_p"), trigramsAre({"up$"}));
+
   EXPECT_THAT(generateQueryTrigrams("X86"), trigramsAre({"x86"}));
 
   EXPECT_THAT(generateQueryTrigrams("clangd"),
Index: clang-tools-extra/clangd/index/dex/Trigram.h
===
--- clang-tools-extra/clangd/index/dex/Trigram.h
+++ clang-tools-extra/clangd/index/dex/Trigram.h
@@ -68,7 +68,10 @@
 ///
 /// For short queries (less than 3 characters with Head or Tail roles in Fuzzy
 /// Matching segmentation) this returns a single trigram with the first
-/// characters (up to 3) to perfrom prefix match.
+/// characters (up to 3) to perform prefix match. However, if the query is short
+/// but it contains two HEAD symbols then the returned trigram would be an
+/// incomplete bigram with those two heads. This would help to match
+/// "unique_ptr" and similar symbols with "u_p" query.
 std::vector generateQueryTrigrams(llvm::StringRef Query);
 
 } // namespace dex
Index: clang-tools-extra/clangd/index/dex/Trigram.cpp
===
--- clang-tools-extra/clangd/index/dex/Trigram.cpp
+++ clang-tools-extra/clangd/index/dex/Trigram.cpp
@@ -128,9 +128,15 @@
   // Additional pass is necessary to count valid identifier characters.
   // Depending on that, this function might return incomplete trigram.
   unsigned ValidSymbolsCount = 0;
-  for (size_t I = 0; I < Roles.size(); ++I)
-if (Roles[I] == Head || Roles[I] == Tail)
+  unsigned Heads = 0;
+  for (size_t I = 0; I < Roles.size(); ++I) {
+if (Roles[I] == Head) {
+  ++ValidSymbolsCount;
+  ++Heads;
+} else if (Roles[I] == Tail) {
   ++ValidSymbolsCount;
+}
+  }
 
   std::string LowercaseQuery = Query.lower();
 
@@ -140,13 +146,26 @@
   // If the number of symbols which can form fuzzy matching trigram is not
   // sufficient, generate a single incomplete trigram for query.
   if (ValidSymbolsCount < 3) {
-std::string Symbols = {{END_MARKER, END_MARKER, END_MARKER}};
+std::string Symbols;
+// If the query is not long enough to form a trigram but contains two heads
+// the returned trigram should be "xy$" where "x" and "y" are the heads.
+// This might be particularly important for cases like "u_p" to match
+// "unique_ptr" and similar symbols from the C++ Standard Library.
+if (Heads == 2) {
+  for (size_t I = 0; I < LowercaseQuery.size(); ++I)
+if (Roles[I] == Head)
+  Symbols += LowercaseQuery[I];
+
+  Symbols += END_MARKER;
+} else {
+  Symbols = {{END_MARKER, END_MARKER, END_MARKER}};
   if (LowercaseQuery.size() > 0)
 Symbols[0] = LowercaseQuery[0];
   if (LowercaseQuery.size() > 1)
 Symbols[1] = LowercaseQuery[1];
   if (LowercaseQuery.size() > 2)
 Symbols[2] = LowercaseQuery[2];
+}
 const auto Trigram = Token(Token::Kind::Trigram, Symbols);
 UniqueTrigrams.insert(Trigram);
   } else {



[PATCH] D50700: [clangd] Generate better incomplete bigrams for the Dex index

2018-08-14 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev updated this revision to Diff 160555.
kbobyrev added a comment.

Treat leading underscores as additional signals and don't extract two heads in 
that case.
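
The effect of the leading-underscore guard can be shown with a small sketch. This is an illustration only: `incompleteTrigram` is a hypothetical helper that takes the query's HEAD characters as a precomputed argument, so the segmentation step is deliberately left out, and the query is assumed to be "short" in the sense used by the patch.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Lowercases a string character by character.
static std::string toLower(std::string S) {
  std::transform(S.begin(), S.end(), S.begin(), [](unsigned char C) {
    return static_cast<char>(std::tolower(C));
  });
  return S;
}

// Hypothetical helper: HeadChars holds the HEAD characters of Query in order,
// as some earlier segmentation step would have identified them.
std::string incompleteTrigram(const std::string &Query,
                              const std::string &HeadChars) {
  const char EndMarker = '$';
  const bool StartsWithUnderscore = !Query.empty() && Query.front() == '_';
  if (HeadChars.size() == 2 && !StartsWithUnderscore)
    return toLower(HeadChars) + EndMarker; // "xy$" built from the two heads.
  // A leading '_' is treated as a signal, so the plain prefix is kept.
  std::string Chars = toLower(Query).substr(0, 3);
  Chars.append(3 - Chars.size(), EndMarker);
  return Chars;
}
```

With the guard in place, `u_p` still maps to `up$`, whereas `_u_p` keeps its literal prefix `_u_`, matching the two test expectations added in this revision.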


https://reviews.llvm.org/D50700

Files:
  clang-tools-extra/clangd/index/dex/Trigram.cpp
  clang-tools-extra/clangd/index/dex/Trigram.h
  clang-tools-extra/unittests/clangd/DexIndexTests.cpp


Index: clang-tools-extra/unittests/clangd/DexIndexTests.cpp
===
--- clang-tools-extra/unittests/clangd/DexIndexTests.cpp
+++ clang-tools-extra/unittests/clangd/DexIndexTests.cpp
@@ -321,6 +321,9 @@
   EXPECT_THAT(generateQueryTrigrams("__"), trigramsAre({"__$"}));
   EXPECT_THAT(generateQueryTrigrams("___"), trigramsAre({"___"}));
 
+  EXPECT_THAT(generateQueryTrigrams("u_p"), trigramsAre({"up$"}));
+  EXPECT_THAT(generateQueryTrigrams("_u_p"), trigramsAre({"_u_"}));
+
   EXPECT_THAT(generateQueryTrigrams("X86"), trigramsAre({"x86"}));
 
   EXPECT_THAT(generateQueryTrigrams("clangd"),
Index: clang-tools-extra/clangd/index/dex/Trigram.h
===
--- clang-tools-extra/clangd/index/dex/Trigram.h
+++ clang-tools-extra/clangd/index/dex/Trigram.h
@@ -62,7 +62,11 @@
 ///
 /// For short queries (less than 3 characters with Head or Tail roles in Fuzzy
 /// Matching segmentation) this returns a single trigram with the first
-/// characters (up to 3) to perfrom prefix match.
+/// characters (up to 3) to perform prefix match. However, if the query is short
+/// but it contains two HEAD symbols then the returned trigram would be an
+/// incomplete bigram with those two HEADs (unless query starts with '_' which
+/// is treated as additional information). This would help to match
+/// "unique_ptr" and similar symbols with "u_p" query.
 std::vector generateQueryTrigrams(llvm::StringRef Query);
 
 } // namespace dex
Index: clang-tools-extra/clangd/index/dex/Trigram.cpp
===
--- clang-tools-extra/clangd/index/dex/Trigram.cpp
+++ clang-tools-extra/clangd/index/dex/Trigram.cpp
@@ -116,21 +116,39 @@
 
   // Additional pass is necessary to count valid identifier characters.
   // Depending on that, this function might return incomplete trigram.
+  unsigned Heads = 0;
   unsigned ValidSymbolsCount = 0;
-  for (size_t I = 0; I < Roles.size(); ++I)
-if (Roles[I] == Head || Roles[I] == Tail)
+  for (size_t I = 0; I < Roles.size(); ++I) {
+if (Roles[I] == Head) {
+  ++ValidSymbolsCount;
+  ++Heads;
+} else if (Roles[I] == Tail) {
   ++ValidSymbolsCount;
+}
+  }
 
   std::string LowercaseQuery = Query.lower();
 
   DenseSet UniqueTrigrams;
 
   // If the number of symbols which can form fuzzy matching trigram is not
   // sufficient, generate a single incomplete trigram for query.
   if (ValidSymbolsCount < 3) {
-std::string Chars =
-LowercaseQuery.substr(0, std::min(3UL, Query.size()));
-Chars.append(3 - Chars.size(), END_MARKER);
+std::string Chars;
+// If the query is not long enough to form a trigram but contains two heads
+// the returned trigram should be "xy$" where "x" and "y" are the heads.
+// This might be particularly important for cases like "u_p" to match
+// "unique_ptr" and similar symbols from the C++ Standard Library.
+if (Heads == 2 && !Query.startswith("_")) {
+  for (size_t I = 0; I < LowercaseQuery.size(); ++I)
+if (Roles[I] == Head)
+  Chars += LowercaseQuery[I];
+
+  Chars += END_MARKER;
+} else {
+  Chars = LowercaseQuery.substr(0, std::min(3UL, Query.size()));
+  Chars.append(3 - Chars.size(), END_MARKER);
+}
 UniqueTrigrams.insert(Token(Token::Kind::Trigram, Chars));
   } else {
 std::deque Chars;



[PATCH] D50702: [clangd] NFC: Cleanup clangd help message

2018-08-14 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev created this revision.
kbobyrev added reviewers: ioeric, ilya-biryukov.
kbobyrev added a project: clang-tools-extra.
Herald added subscribers: arphaman, jkorous, MaskRay.

Add a missing space and fix a typo.


https://reviews.llvm.org/D50702

Files:
  clang-tools-extra/clangd/tool/ClangdMain.cpp


Index: clang-tools-extra/clangd/tool/ClangdMain.cpp
===
--- clang-tools-extra/clangd/tool/ClangdMain.cpp
+++ clang-tools-extra/clangd/tool/ClangdMain.cpp
@@ -147,7 +147,7 @@
 static llvm::cl::opt EnableIndex(
 "index",
 llvm::cl::desc("Enable index-based features such as global code completion "
-   "and searching for symbols."
+   "and searching for symbols. "
"Clang uses an index built from symbols in opened files"),
 llvm::cl::init(true));
 
@@ -160,7 +160,7 @@
 static llvm::cl::opt HeaderInsertionDecorators(
 "header-insertion-decorators",
 llvm::cl::desc("Prepend a circular dot or space before the completion "
-   "label, depending on wether "
+   "label, depending on whether "
"an include line will be inserted or not."),
 llvm::cl::init(true));
 




[PATCH] D50702: [clangd] NFC: Cleanup clangd help message

2018-08-14 Thread Kirill Bobyrev via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rCTE339673: [clangd] NFC: Cleanup clangd help message 
(authored by omtcyfz, committed by ).

Changed prior to commit:
  https://reviews.llvm.org/D50702?vs=160557&id=160558#toc

Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D50702

Files:
  clangd/tool/ClangdMain.cpp


Index: clangd/tool/ClangdMain.cpp
===
--- clangd/tool/ClangdMain.cpp
+++ clangd/tool/ClangdMain.cpp
@@ -147,7 +147,7 @@
 static llvm::cl::opt EnableIndex(
 "index",
 llvm::cl::desc("Enable index-based features such as global code completion "
-   "and searching for symbols."
+   "and searching for symbols. "
"Clang uses an index built from symbols in opened files"),
 llvm::cl::init(true));
 
@@ -160,7 +160,7 @@
 static llvm::cl::opt HeaderInsertionDecorators(
 "header-insertion-decorators",
 llvm::cl::desc("Prepend a circular dot or space before the completion "
-   "label, depending on wether "
+   "label, depending on whether "
"an include line will be inserted or not."),
 llvm::cl::init(true));
 




[PATCH] D50703: [clangd] NFC: Mark Workspace Symbol feature complete in the documentation

2018-08-14 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev created this revision.
kbobyrev added reviewers: ioeric, ilya-biryukov.
kbobyrev added a project: clang-tools-extra.
Herald added subscribers: arphaman, jkorous, MaskRay.

Workspace Symbol implementation was introduced in 
https://reviews.llvm.org/D44882 and should be complete now.


https://reviews.llvm.org/D50703

Files:
  clang-tools-extra/docs/clangd.rst


Index: clang-tools-extra/docs/clangd.rst
===
--- clang-tools-extra/docs/clangd.rst
+++ clang-tools-extra/docs/clangd.rst
@@ -64,7 +64,7 @@
 | Completion  | Yes|   Yes|
 +-++--+
 | Diagnostics | Yes|   Yes|
-+-++--+ 
++-++--+
 | Fix-its | Yes|   Yes|
 +-++--+
 | Go to Definition| Yes|   Yes|
@@ -83,7 +83,7 @@
 +-++--+
 | Document Symbols| Yes|   Yes|
 +-++--+
-| Workspace Symbols   | Yes|   No |
+| Workspace Symbols   | Yes|   Yes|
 +-++--+
 | Syntax and Semantic Coloring| No |   No |
 +-++--+




[PATCH] D50337: [clangd] DexIndex implementation prototype

2018-08-14 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev updated this revision to Diff 160576.
kbobyrev added a comment.

Don't separate the logic for "long" and "short" queries:
https://reviews.llvm.org/D50517 (https://reviews.llvm.org/rCTE339548)
introduced incomplete trigrams, which can be used for "short" queries too.


https://reviews.llvm.org/D50337

Files:
  clang-tools-extra/clangd/CMakeLists.txt
  clang-tools-extra/clangd/index/dex/DexIndex.cpp
  clang-tools-extra/clangd/index/dex/DexIndex.h
  clang-tools-extra/clangd/index/dex/Token.h
  clang-tools-extra/unittests/clangd/CMakeLists.txt
  clang-tools-extra/unittests/clangd/DexIndexTests.cpp
  clang-tools-extra/unittests/clangd/IndexTests.cpp
  clang-tools-extra/unittests/clangd/TestIndexOperations.cpp
  clang-tools-extra/unittests/clangd/TestIndexOperations.h

Index: clang-tools-extra/unittests/clangd/TestIndexOperations.cpp
===
--- /dev/null
+++ clang-tools-extra/unittests/clangd/TestIndexOperations.cpp
@@ -0,0 +1,89 @@
+//===-- IndexHelpers.cpp *- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+
+#include "TestIndexOperations.h"
+
+namespace clang {
+namespace clangd {
+
+Symbol symbol(llvm::StringRef QName) {
+  Symbol Sym;
+  Sym.ID = SymbolID(QName.str());
+  size_t Pos = QName.rfind("::");
+  if (Pos == llvm::StringRef::npos) {
+Sym.Name = QName;
+Sym.Scope = "";
+  } else {
+Sym.Name = QName.substr(Pos + 2);
+Sym.Scope = QName.substr(0, Pos + 2);
+  }
+  return Sym;
+}
+
+// Create a slab of symbols with the given qualified names as both IDs and
+// names. The life time of the slab is managed by the returned shared pointer.
+// If \p WeakSymbols is provided, it will be pointed to the managed object in
+// the returned shared pointer.
+std::shared_ptr>
+generateSymbols(std::vector QualifiedNames,
+std::weak_ptr *WeakSymbols) {
+  SymbolSlab::Builder Slab;
+  for (llvm::StringRef QName : QualifiedNames)
+Slab.insert(symbol(QName));
+
+  auto Storage = std::make_shared();
+  Storage->Slab = std::move(Slab).build();
+  for (const auto &Sym : Storage->Slab)
+Storage->Pointers.push_back(&Sym);
+  if (WeakSymbols)
+*WeakSymbols = Storage;
+  auto *Pointers = &Storage->Pointers;
+  return {std::move(Storage), Pointers};
+}
+
+// Create a slab of symbols with IDs and names [Begin, End], otherwise identical
+// to the `generateSymbols` above.
+std::shared_ptr>
+generateNumSymbols(int Begin, int End,
+   std::weak_ptr *WeakSymbols) {
+  std::vector Names;
+  for (int i = Begin; i <= End; i++)
+Names.push_back(std::to_string(i));
+  return generateSymbols(Names, WeakSymbols);
+}
+
+std::string getQualifiedName(const Symbol &Sym) {
+  return (Sym.Scope + Sym.Name).str();
+}
+
+std::vector match(const SymbolIndex &I,
+   const FuzzyFindRequest &Req, bool *Incomplete) {
+  std::vector Matches;
+  bool IsIncomplete = I.fuzzyFind(Req, [&](const Symbol &Sym) {
+Matches.push_back(clang::clangd::getQualifiedName(Sym));
+  });
+  if (Incomplete)
+*Incomplete = IsIncomplete;
+  return Matches;
+}
+
+// Returns qualified names of symbols with any of IDs in the index.
+std::vector lookup(const SymbolIndex &I,
+llvm::ArrayRef IDs) {
+  LookupRequest Req;
+  Req.IDs.insert(IDs.begin(), IDs.end());
+  std::vector Results;
+  I.lookup(Req, [&](const Symbol &Sym) {
+Results.push_back(getQualifiedName(Sym));
+  });
+  return Results;
+}
+
+} // namespace clangd
+} // namespace clang
Index: clang-tools-extra/unittests/clangd/TestIndexOperations.h
===
--- /dev/null
+++ clang-tools-extra/unittests/clangd/TestIndexOperations.h
@@ -0,0 +1,57 @@
+//===-- IndexHelpers.h --*- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+
+#ifndef LLVM_CLANG_TOOLS_EXTRA_UNITTESTS_CLANGD_INDEXTESTCOMMON_H
+#define LLVM_CLANG_TOOLS_EXTRA_UNITTESTS_CLANGD_INDEXTESTCOMMON_H
+
+#include "index/Index.h"
+#include "index/Merge.h"
+#include "index/dex/DexIndex.h"
+#include "index/dex/Iterator.h"
+#include "index/dex/Token.h"
+#include "index/dex/Trigram.h"
+
+namespace clang {
+namespace clangd {
+
+Symbol symbol(llvm::StringRef QName);
+
+struct SlabAndPointers {
+  SymbolSlab Slab;
+  std::vector<const Symbol *> Pointers;
+};
+
+// Create a slab of symbols with the given qualified names as both IDs and
+// names. The life time of the slab is managed by the retur

[PATCH] D50707: NFC: Enforce good formatting across multiple clang-tools-extra files

2018-08-14 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev created this revision.
kbobyrev added reviewers: ioeric, ilya-biryukov.
kbobyrev added a project: clang-tools-extra.
Herald added subscribers: arphaman, jkorous.

This patch improves readability of multiple files in clang-tools-extra and 
enforces LLVM Coding Guidelines.


https://reviews.llvm.org/D50707

Files:
  clang-tools-extra/clangd/ClangdLSPServer.cpp
  clang-tools-extra/clangd/ClangdLSPServer.h
  clang-tools-extra/clangd/ClangdUnit.cpp
  clang-tools-extra/clangd/ClangdUnit.h
  clang-tools-extra/clangd/CodeComplete.cpp
  clang-tools-extra/clangd/CodeComplete.h
  clang-tools-extra/clangd/CodeCompletionStrings.cpp
  clang-tools-extra/clangd/CodeCompletionStrings.h
  clang-tools-extra/clangd/Compiler.cpp
  clang-tools-extra/clangd/Compiler.h
  clang-tools-extra/clangd/Context.cpp
  clang-tools-extra/clangd/Diagnostics.cpp
  clang-tools-extra/clangd/Diagnostics.h
  clang-tools-extra/clangd/GlobalCompilationDatabase.cpp
  clang-tools-extra/clangd/GlobalCompilationDatabase.h
  clang-tools-extra/clangd/Quality.cpp
  clang-tools-extra/clangd/Quality.h
  clang-tools-extra/clangd/XRefs.cpp
  clang-tools-extra/clangd/XRefs.h
  clang-tools-extra/clangd/global-symbol-builder/GlobalSymbolBuilderMain.cpp
  clang-tools-extra/clangd/index/CanonicalIncludes.h
  clang-tools-extra/clangd/index/FileIndex.h
  clang-tools-extra/clangd/index/Index.h
  clang-tools-extra/clangd/index/Merge.cpp
  clang-tools-extra/clangd/index/Merge.h
  clang-tools-extra/clangd/index/SymbolYAML.h
  clang-tools-extra/clangd/index/dex/Iterator.h
  clang-tools-extra/clangd/index/dex/Token.h
  clang-tools-extra/clangd/index/dex/Trigram.h
  clang-tools-extra/modularize/ModuleAssistant.cpp
  clang-tools-extra/unittests/clangd/Annotations.cpp
  clang-tools-extra/unittests/clangd/Annotations.h
  clang-tools-extra/unittests/clangd/SyncAPI.cpp
  clang-tools-extra/unittests/clangd/SyncAPI.h
  clang-tools-extra/unittests/clangd/TestTU.cpp
  clang-tools-extra/unittests/clangd/TestTU.h

Index: clang-tools-extra/unittests/clangd/TestTU.h
===
--- clang-tools-extra/unittests/clangd/TestTU.h
+++ clang-tools-extra/unittests/clangd/TestTU.h
@@ -1,21 +1,23 @@
-//===--- TestTU.h - Scratch source files for testing *- C++-*-===//
+//===--- TestTU.h - Scratch source files for testing -*- C++-*-===//
 //
 // The LLVM Compiler Infrastructure
 //
 // This file is distributed under the University of Illinois Open Source
 // License. See LICENSE.TXT for details.
 //
-//===-===//
+//===--===//
 //
 // Many tests for indexing, code completion etc are most naturally expressed
 // using code examples.
 // TestTU lets test define these examples in a common way without dealing with
 // the mechanics of VFS and compiler interactions, and then easily grab the
 // AST, particular symbols, etc.
 //
 //===-===//
+
 #ifndef LLVM_CLANG_TOOLS_EXTRA_UNITTESTS_CLANGD_TESTTU_H
 #define LLVM_CLANG_TOOLS_EXTRA_UNITTESTS_CLANGD_TESTTU_H
+
 #include "ClangdUnit.h"
 #include "index/Index.h"
 #include "gtest/gtest.h"
@@ -64,4 +66,5 @@
 
 } // namespace clangd
 } // namespace clang
-#endif
+
+#endif // LLVM_CLANG_TOOLS_EXTRA_UNITTESTS_CLANGD_TESTTU_H
Index: clang-tools-extra/unittests/clangd/TestTU.cpp
===
--- clang-tools-extra/unittests/clangd/TestTU.cpp
+++ clang-tools-extra/unittests/clangd/TestTU.cpp
@@ -5,7 +5,8 @@
 // This file is distributed under the University of Illinois Open Source
 // License. See LICENSE.TXT for details.
 //
-//===-===//
+//===--===//
+
 #include "TestTU.h"
 #include "TestFS.h"
 #include "index/FileIndex.h"
Index: clang-tools-extra/unittests/clangd/SyncAPI.h
===
--- clang-tools-extra/unittests/clangd/SyncAPI.h
+++ clang-tools-extra/unittests/clangd/SyncAPI.h
@@ -5,10 +5,14 @@
 // This file is distributed under the University of Illinois Open Source
 // License. See LICENSE.TXT for details.
 //
-//===-===//
+//===--===//
+//
 // This file contains synchronous versions of ClangdServer's async API. We
 // deliberately don't expose the sync API outside tests to encourage using the
 // async versions in clangd code.
+//
+//===--===//
+
 #ifndef LLVM_CLANG_TOOLS_EXTRA_UNITTESTS_CLANGD_SYNCAPI_H
 #define LLVM_CLANG_TOOLS_EXTRA_UNITTESTS_CLANGD_SYNCAPI_H
 
@@ -49,4 +53,4 @@
 } // namespace clangd
 } 

[PATCH] D50707: NFC: Enforce good formatting across multiple clang-tools-extra files

2018-08-14 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev updated this revision to Diff 160605.
kbobyrev added a comment.

I have updated the patch so that it only affects comments and header guards and 
inserts a few newlines. Actual source code is not affected, so that the 
`git blame` log stays less cryptic.


https://reviews.llvm.org/D50707

Files:
  clang-tools-extra/clangd/ClangdLSPServer.cpp
  clang-tools-extra/clangd/ClangdLSPServer.h
  clang-tools-extra/clangd/ClangdUnit.cpp
  clang-tools-extra/clangd/ClangdUnit.h
  clang-tools-extra/clangd/CodeComplete.cpp
  clang-tools-extra/clangd/CodeComplete.h
  clang-tools-extra/clangd/CodeCompletionStrings.cpp
  clang-tools-extra/clangd/CodeCompletionStrings.h
  clang-tools-extra/clangd/Compiler.cpp
  clang-tools-extra/clangd/Compiler.h
  clang-tools-extra/clangd/Context.cpp
  clang-tools-extra/clangd/Diagnostics.cpp
  clang-tools-extra/clangd/Diagnostics.h
  clang-tools-extra/clangd/GlobalCompilationDatabase.cpp
  clang-tools-extra/clangd/GlobalCompilationDatabase.h
  clang-tools-extra/clangd/Quality.cpp
  clang-tools-extra/clangd/Quality.h
  clang-tools-extra/clangd/XRefs.cpp
  clang-tools-extra/clangd/XRefs.h
  clang-tools-extra/clangd/global-symbol-builder/GlobalSymbolBuilderMain.cpp
  clang-tools-extra/clangd/index/CanonicalIncludes.h
  clang-tools-extra/clangd/index/FileIndex.h
  clang-tools-extra/clangd/index/Index.h
  clang-tools-extra/clangd/index/Merge.cpp
  clang-tools-extra/clangd/index/Merge.h
  clang-tools-extra/clangd/index/SymbolYAML.h
  clang-tools-extra/clangd/index/dex/Iterator.h
  clang-tools-extra/clangd/index/dex/Token.h
  clang-tools-extra/clangd/index/dex/Trigram.h
  clang-tools-extra/modularize/ModuleAssistant.cpp
  clang-tools-extra/unittests/clangd/Annotations.cpp
  clang-tools-extra/unittests/clangd/Annotations.h
  clang-tools-extra/unittests/clangd/SyncAPI.cpp
  clang-tools-extra/unittests/clangd/SyncAPI.h
  clang-tools-extra/unittests/clangd/TestTU.cpp
  clang-tools-extra/unittests/clangd/TestTU.h

Index: clang-tools-extra/unittests/clangd/TestTU.h
===
--- clang-tools-extra/unittests/clangd/TestTU.h
+++ clang-tools-extra/unittests/clangd/TestTU.h
@@ -1,21 +1,23 @@
-//===--- TestTU.h - Scratch source files for testing *- C++-*-===//
+//===--- TestTU.h - Scratch source files for testing -*- C++-*-===//
 //
 // The LLVM Compiler Infrastructure
 //
 // This file is distributed under the University of Illinois Open Source
 // License. See LICENSE.TXT for details.
 //
-//===-===//
+//===--===//
 //
 // Many tests for indexing, code completion etc are most naturally expressed
 // using code examples.
 // TestTU lets test define these examples in a common way without dealing with
 // the mechanics of VFS and compiler interactions, and then easily grab the
 // AST, particular symbols, etc.
 //
 //===-===//
+
 #ifndef LLVM_CLANG_TOOLS_EXTRA_UNITTESTS_CLANGD_TESTTU_H
 #define LLVM_CLANG_TOOLS_EXTRA_UNITTESTS_CLANGD_TESTTU_H
+
 #include "ClangdUnit.h"
 #include "index/Index.h"
 #include "gtest/gtest.h"
@@ -64,4 +66,5 @@
 
 } // namespace clangd
 } // namespace clang
-#endif
+
+#endif // LLVM_CLANG_TOOLS_EXTRA_UNITTESTS_CLANGD_TESTTU_H
Index: clang-tools-extra/unittests/clangd/TestTU.cpp
===
--- clang-tools-extra/unittests/clangd/TestTU.cpp
+++ clang-tools-extra/unittests/clangd/TestTU.cpp
@@ -5,7 +5,8 @@
 // This file is distributed under the University of Illinois Open Source
 // License. See LICENSE.TXT for details.
 //
-//===-===//
+//===--===//
+
 #include "TestTU.h"
 #include "TestFS.h"
 #include "index/FileIndex.h"
Index: clang-tools-extra/unittests/clangd/SyncAPI.h
===
--- clang-tools-extra/unittests/clangd/SyncAPI.h
+++ clang-tools-extra/unittests/clangd/SyncAPI.h
@@ -5,10 +5,14 @@
 // This file is distributed under the University of Illinois Open Source
 // License. See LICENSE.TXT for details.
 //
-//===-===//
+//===--===//
+//
 // This file contains synchronous versions of ClangdServer's async API. We
 // deliberately don't expose the sync API outside tests to encourage using the
 // async versions in clangd code.
+//
+//===--===//
+
 #ifndef LLVM_CLANG_TOOLS_EXTRA_UNITTESTS_CLANGD_SYNCAPI_H
 #define LLVM_CLANG_TOOLS_EXTRA_UNITTESTS_CLANGD_SYNCAPI_H
 
@@ -49,4 +53,4 @@
 } // namespace clangd
 } // namespace clang
 
-#e

[PATCH] D50707: NFC: Enforce good formatting across multiple clang-tools-extra files

2018-08-14 Thread Kirill Bobyrev via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rCTE339687: NFC: Enforce good formatting across multiple 
clang-tools-extra files (authored by omtcyfz, committed by ).

Changed prior to commit:
  https://reviews.llvm.org/D50707?vs=160605&id=160607#toc

Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D50707

Files:
  clangd/ClangdLSPServer.cpp
  clangd/ClangdLSPServer.h
  clangd/ClangdUnit.cpp
  clangd/ClangdUnit.h
  clangd/CodeComplete.cpp
  clangd/CodeComplete.h
  clangd/CodeCompletionStrings.cpp
  clangd/CodeCompletionStrings.h
  clangd/Compiler.cpp
  clangd/Compiler.h
  clangd/Context.cpp
  clangd/Diagnostics.cpp
  clangd/Diagnostics.h
  clangd/GlobalCompilationDatabase.cpp
  clangd/GlobalCompilationDatabase.h
  clangd/Quality.cpp
  clangd/Quality.h
  clangd/XRefs.cpp
  clangd/XRefs.h
  clangd/global-symbol-builder/GlobalSymbolBuilderMain.cpp
  clangd/index/CanonicalIncludes.h
  clangd/index/FileIndex.h
  clangd/index/Index.h
  clangd/index/Merge.cpp
  clangd/index/Merge.h
  clangd/index/SymbolYAML.h
  clangd/index/dex/Iterator.h
  clangd/index/dex/Token.h
  clangd/index/dex/Trigram.h
  modularize/ModuleAssistant.cpp
  unittests/clangd/Annotations.cpp
  unittests/clangd/Annotations.h
  unittests/clangd/SyncAPI.cpp
  unittests/clangd/SyncAPI.h
  unittests/clangd/TestTU.cpp
  unittests/clangd/TestTU.h

Index: clangd/Diagnostics.h
===
--- clangd/Diagnostics.h
+++ clangd/Diagnostics.h
@@ -1,11 +1,11 @@
-//===--- Diagnostics.h --*- C++-*-===//
+//===--- Diagnostics.h ---*- C++-*-===//
 //
 // The LLVM Compiler Infrastructure
 //
 // This file is distributed under the University of Illinois Open Source
 // License. See LICENSE.TXT for details.
 //
-//===-===//
+//===--===//
 
 #ifndef LLVM_CLANG_TOOLS_EXTRA_CLANGD_DIAGNOSTICS_H
 #define LLVM_CLANG_TOOLS_EXTRA_CLANGD_DIAGNOSTICS_H
@@ -101,4 +101,4 @@
 } // namespace clangd
 } // namespace clang
 
-#endif
+#endif // LLVM_CLANG_TOOLS_EXTRA_CLANGD_DIAGNOSTICS_H
Index: clangd/Compiler.cpp
===
--- clangd/Compiler.cpp
+++ clangd/Compiler.cpp
@@ -1,11 +1,11 @@
-//===--- Compiler.cpp ---*- C++-*-===//
+//===--- Compiler.cpp *- C++-*-===//
 //
 // The LLVM Compiler Infrastructure
 //
 // This file is distributed under the University of Illinois Open Source
 // License. See LICENSE.TXT for details.
 //
-//===-===//
+//===--===//
 
 #include "Compiler.h"
 #include "Logger.h"
Index: clangd/CodeComplete.h
===
--- clangd/CodeComplete.h
+++ clangd/CodeComplete.h
@@ -1,17 +1,18 @@
-//===--- CodeComplete.h -*- C++-*-===//
+//===--- CodeComplete.h --*- C++-*-===//
 //
 // The LLVM Compiler Infrastructure
 //
 // This file is distributed under the University of Illinois Open Source
 // License. See LICENSE.TXT for details.
 //
-//===-===//
+//===--===//
 //
 // Code completion provides suggestions for what the user might type next.
 // After "std::string S; S." we might suggest members of std::string.
 // Signature help describes the parameters of a function as you type them.
 //
-//===-===//
+//===--===//
+
 #ifndef LLVM_CLANG_TOOLS_EXTRA_CLANGD_CODECOMPLETE_H
 #define LLVM_CLANG_TOOLS_EXTRA_CLANGD_CODECOMPLETE_H
 
@@ -192,4 +193,4 @@
 } // namespace clangd
 } // namespace clang
 
-#endif
+#endif // LLVM_CLANG_TOOLS_EXTRA_CLANGD_CODECOMPLETE_H
Index: clangd/global-symbol-builder/GlobalSymbolBuilderMain.cpp
===
--- clangd/global-symbol-builder/GlobalSymbolBuilderMain.cpp
+++ clangd/global-symbol-builder/GlobalSymbolBuilderMain.cpp
@@ -11,7 +11,7 @@
 // whole project. This tools is for **experimental** only. Don't use it in
 // production code.
 //
-//===-===//
+//===--===//
 
 #include "index/CanonicalIncludes.h"
 #include "index/Index.h"
Index: clangd/ClangdLSPServer.cpp
==

[PATCH] D50839: [llvm] Optimize YAML::isNumeric

2018-08-16 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev created this revision.
kbobyrev added reviewers: ilya-biryukov, ioeric.
Herald added a reviewer: javed.absar.
Herald added a subscriber: kristof.beyls.

This patch significantly improves performance of the YAML serializer by 
optimizing the `YAML::isNumeric` function. This function is called on most 
strings and is highly inefficient for two reasons:

- It uses `Regex`, which is parsed and compiled each time this function is 
called
- It uses multiple passes which are not necessary

This patch introduces a stateful ad hoc YAML number parser which does not rely 
on `Regex`. It also fixes a YAML number format inconsistency: the current 
implementation supports the C-style octal number format (`01234567`), which was 
present in the YAML 1.0 specification (http://yaml.org/spec/1.0/), [Section 2.4. 
Tags, Example 2.19], but was deprecated and is no longer present in the latest 
YAML 1.2 specification (http://yaml.org/spec/1.2/spec.html), see [Section 
10.3.2. Tag Resolution]. Since the rest of the implementation does not support 
other deprecated YAML 1.0 numeric features, such as sexagesimal numbers and 
commas as delimiters, this is treated as an inconsistency and is no longer 
supported.
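
To illustrate the idea of a single-pass check without `Regex`, here is a 
minimal sketch. It is not the code from this patch: `isNumericSketch` and 
`skipDigits` are hypothetical names, the sketch uses only the standard library, 
and it assumes that `0o`/`0x` prefixed integers are unsigned.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Advances Pos past decimal digits; returns true if at least one was consumed.
static bool skipDigits(const std::string &S, std::size_t &Pos) {
  std::size_t Start = Pos;
  while (Pos < S.size() && std::isdigit(static_cast<unsigned char>(S[Pos])))
    ++Pos;
  return Pos != Start;
}

// Hypothetical single-pass numeric check (illustration only).
bool isNumericSketch(const std::string &S) {
  if (S == ".nan" || S == ".NaN" || S == ".NAN")
    return true;
  // Prefixed integers: "0o" + octal digits or "0x" + hex digits.
  // Assumption of this sketch: no sign is accepted in front of the prefix.
  if (S.size() > 2 && S[0] == '0' && (S[1] == 'o' || S[1] == 'x')) {
    for (std::size_t I = 2; I < S.size(); ++I) {
      unsigned char C = static_cast<unsigned char>(S[I]);
      bool Ok = S[1] == 'o' ? (C >= '0' && C <= '7') : std::isxdigit(C) != 0;
      if (!Ok)
        return false;
    }
    return true;
  }
  // Optional sign.
  std::size_t Pos = 0;
  if (!S.empty() && (S[0] == '+' || S[0] == '-'))
    Pos = 1;
  const std::string Tail = S.substr(Pos);
  if (Tail == ".inf" || Tail == ".Inf" || Tail == ".INF")
    return true;
  // Mantissa: digits, optional '.', digits; at least one digit overall.
  bool HasDigits = skipDigits(S, Pos);
  if (Pos < S.size() && S[Pos] == '.') {
    ++Pos;
    HasDigits = skipDigits(S, Pos) || HasDigits;
  }
  if (!HasDigits)
    return false;
  if (Pos == S.size())
    return true;
  // Exponent: 'e' or 'E', optional sign, at least one digit, nothing after.
  if (S[Pos] != 'e' && S[Pos] != 'E')
    return false;
  ++Pos;
  if (Pos < S.size() && (S[Pos] == '+' || S[Pos] == '-'))
    ++Pos;
  return skipDigits(S, Pos) && Pos == S.size();
}
```

Every character is inspected at most a small constant number of times and 
nothing is compiled at call time, which is the property the patch is after.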

This performance bottleneck was identified while profiling Clangd's 
global-symbol-builder tool with my colleague @ilya-biryukov. A substantial 
part of the runtime was spent in the single-threaded Reduce phase, which 
concludes with YAML serialization of the collected symbols. Regex matching 
accounted for approximately 45% of the whole runtime (which includes the 
sharded Map phase); now it is reduced to 18% (which is spent in 
`clang::clangd::CanonicalIncludes` and can also be optimized because all used 
regexes are in fact either suffix matches or exact matches).

Benchmarking `global-symbol-builder` (using the `hyperfine --warmup 2 
--min-runs 5 'command 1' 'command 2'` tool) by processing a reasonable amount 
of code (26 source files matched by `clang-tools-extra/clangd/*.cpp` with all 
transitive includes) confirmed our understanding of the nature of the 
performance bottleneck, as the patch speeds up the command by a factor of 1.6x:

| Command                                    |    Mean [s] | Min…Max [s] |
| :---                                       |        ---: |        ---: |
| patch                                      |  84.7 ± 0.6 |   83.3…84.7 |
| master (https://reviews.llvm.org/rL339849) | 133.1 ± 0.8 | 132.4…134.6 |

Using smaller samples (e.g. by collecting symbols from 
`clang-tools-extra/clangd/AST.cpp` only) yields an even better performance 
improvement of 2.05x, which is expected because the Map phase then takes less 
time relative to Reduce:

| Command                                    |      Mean [ms] |  Min…Max [ms] |
| :---                                       |           ---: |          ---: |
| patch                                      |  3702.2 ± 48.7 | 3635.1…3752.3 |
| master (https://reviews.llvm.org/rL339849) | 7607.6 ± 109.5 | 7533.3…7796.4 |


https://reviews.llvm.org/D50839

Files:
  llvm/include/llvm/Support/YAMLTraits.h
  llvm/unittests/Support/YAMLIOTest.cpp

Index: llvm/unittests/Support/YAMLIOTest.cpp
===
--- llvm/unittests/Support/YAMLIOTest.cpp
+++ llvm/unittests/Support/YAMLIOTest.cpp
@@ -16,16 +16,17 @@
 #include "gmock/gmock.h"
 #include "gtest/gtest.h"
 
+using llvm::yaml::Hex16;
+using llvm::yaml::Hex32;
+using llvm::yaml::Hex64;
+using llvm::yaml::Hex8;
 using llvm::yaml::Input;
-using llvm::yaml::Output;
 using llvm::yaml::IO;
-using llvm::yaml::MappingTraits;
+using llvm::yaml::isNumeric;
 using llvm::yaml::MappingNormalization;
+using llvm::yaml::MappingTraits;
+using llvm::yaml::Output;
 using llvm::yaml::ScalarTraits;
-using llvm::yaml::Hex8;
-using llvm::yaml::Hex16;
-using llvm::yaml::Hex32;
-using llvm::yaml::Hex64;
 using ::testing::StartsWith;
 
 
@@ -2569,3 +2570,63 @@
 TestEscaped((char const *)foobar, "\"foo\\u200Bbar\"");
   }
 }
+
+TEST(YAMLIO, Numeric) {
+  EXPECT_TRUE(isNumeric(".inf"));
+  EXPECT_TRUE(isNumeric(".INF"));
+  EXPECT_TRUE(isNumeric(".Inf"));
+  EXPECT_TRUE(isNumeric("-.inf"));
+  EXPECT_TRUE(isNumeric("+.inf"));
+
+  EXPECT_TRUE(isNumeric(".nan"));
+  EXPECT_TRUE(isNumeric(".NaN"));
+  EXPECT_TRUE(isNumeric(".NAN"));
+
+  EXPECT_TRUE(isNumeric("0"));
+  EXPECT_TRUE(isNumeric("0."));
+  EXPECT_TRUE(isNumeric("0.0"));
+  EXPECT_TRUE(isNumeric("-0.0"));
+  EXPECT_TRUE(isNumeric("+0.0"));
+
+  EXPECT_TRUE(isNumeric("+12.0"));
+  EXPECT_TRUE(isNumeric(".5"));
+  EXPECT_TRUE(isNumeric("+.5"));
+  EXPECT_TRUE(isNumeric("-1.0"));
+
+  EXPECT_TRUE(isNumeric("2.3e4"));
+  EXPECT_TRUE(isNumeric("-2E+05"));
+  EXPECT_TRUE(isNumeric("+12e03"));
+  EXPECT_TRUE(isNumeric("6.8523015e+5"));
+
+  EXPECT_TRUE(isNumeric("1.e+1"));
+  EXPECT_TRUE(isNumeric(".0e+1"));
+
+  EXPECT_TRUE(isNumeric("0x2aF3"));
+  EXPECT_TRUE(isNumeric("0o01234567"));
+
+  EXPECT_FALSE(isNumeric("not a number"));
+  EXPECT_FALSE(isNumeric("."));
+  EXPECT_FALSE(isNumeric(".e+1"));
+  EXPECT_FALSE(isNumeric(".1e"));
+  EXPECT_FALSE(isNumeric(".1e+"));
+  EXPECT_FALSE(isNumeric(".1e++1"));
+
+  EXPECT_FALSE(isNumeric("+0x2A

[PATCH] D50689: [clangd] NFC: Improve Dex Iterators debugging traits

2018-08-16 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev updated this revision to Diff 161013.
kbobyrev marked an inline comment as done.
kbobyrev added a comment.

Improved wording to prevent confusion: no more `IDX` (which is the one pointed 
to by the iterator) and `IDN`; just mention that the element being pointed to 
is the one enclosed in `{}` braces.
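
The format described above can be sketched as a standalone function. This is 
only an illustration under assumptions: `dumpPostingList` is a hypothetical 
name, and the current position is modelled as a plain index into a 
`std::vector` instead of an iterator member.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the posting list dump: prints all document IDs
// and wraps the element at position Index (or END) in {} braces.
std::string dumpPostingList(const std::vector<unsigned> &Documents,
                            std::size_t Index) {
  std::ostringstream OS;
  OS << '[';
  const char *Separator = "";
  for (std::size_t I = 0; I < Documents.size(); ++I) {
    OS << Separator;
    if (I == Index)
      OS << '{' << Documents[I] << '}';
    else
      OS << Documents[I];
    Separator = ", ";
  }
  OS << Separator;
  // An exhausted list points at END.
  OS << (Index >= Documents.size() ? "{END}" : "END");
  OS << ']';
  return OS.str();
}
```

For example, `dumpPostingList({4, 7, 8, 20, 42, 100}, 0)` produces 
`[{4}, 7, 8, 20, 42, 100, END]`, and an empty list produces `[{END}]`.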


https://reviews.llvm.org/D50689

Files:
  clang-tools-extra/clangd/index/dex/Iterator.cpp
  clang-tools-extra/clangd/index/dex/Iterator.h
  clang-tools-extra/unittests/clangd/DexIndexTests.cpp


Index: clang-tools-extra/unittests/clangd/DexIndexTests.cpp
===
--- clang-tools-extra/unittests/clangd/DexIndexTests.cpp
+++ clang-tools-extra/unittests/clangd/DexIndexTests.cpp
@@ -231,13 +231,14 @@
   const PostingList L4 = {0, 1, 5};
   const PostingList L5;
 
-  EXPECT_EQ(llvm::to_string(*(create(L0))), "[4, 7, 8, 20, 42, 100]");
+  EXPECT_EQ(llvm::to_string(*(create(L0))), "[{4}, 7, 8, 20, 42, 100, END]");
 
   auto Nested = createAnd(createAnd(create(L1), create(L2)),
   createOr(create(L3), create(L4), create(L5)));
 
   EXPECT_EQ(llvm::to_string(*Nested),
-"(& (& [1, 3, 5, 8, 9] [1, 5, 7, 9]) (| [0, 5] [0, 1, 5] []))");
+"(& (& [{1}, 3, 5, 8, 9, END] [{1}, 5, 7, 9, END]) (| [0, {5}, "
+"END] [0, {1}, 5, END] [{END}]))");
 }
 
 TEST(DexIndexIterators, Limit) {
Index: clang-tools-extra/clangd/index/dex/Iterator.h
===
--- clang-tools-extra/clangd/index/dex/Iterator.h
+++ clang-tools-extra/clangd/index/dex/Iterator.h
@@ -99,7 +99,9 @@
   ///
   /// Where Type is the iterator type representation: "&" for And, "|" for Or,
   /// ChildN is N-th iterator child. Raw iterators over PostingList are
-  /// represented as "[ID1, ID2, ...]" where IDN is N-th PostingList entry.
+  /// represented as "[ID1, ID2, ..., {IDN}, ... END]" where IDN is N-th
+  /// PostingList entry and the element which is pointed to by the PostingList
+  /// iterator is enclosed in {} braces.
   friend llvm::raw_ostream &operator<<(llvm::raw_ostream &OS,
const Iterator &Iterator) {
 return Iterator.dump(OS);
Index: clang-tools-extra/clangd/index/dex/Iterator.cpp
===
--- clang-tools-extra/clangd/index/dex/Iterator.cpp
+++ clang-tools-extra/clangd/index/dex/Iterator.cpp
@@ -49,10 +49,19 @@
   llvm::raw_ostream &dump(llvm::raw_ostream &OS) const override {
     OS << '[';
     auto Separator = "";
-    for (const auto &ID : Documents) {
-      OS << Separator << ID;
+    for (auto It = std::begin(Documents); It != std::end(Documents); ++It) {
+      OS << Separator;
+      if (It == Index)
+        OS << '{' << *It << '}';
+      else
+        OS << *It;
       Separator = ", ";
     }
+    OS << Separator;
+    if (Index == std::end(Documents))
+      OS << "{END}";
+    else
+      OS << "END";
     OS << ']';
     return OS;
   }



[PATCH] D50689: [clangd] NFC: Improve Dex Iterators debugging traits

2018-08-16 Thread Kirill Bobyrev via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rCTE339877: [clangd] NFC: Improve Dex Iterators debugging 
traits (authored by omtcyfz, committed by ).

Changed prior to commit:
  https://reviews.llvm.org/D50689?vs=161013&id=161014#toc

Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D50689

Files:
  clangd/index/dex/Iterator.cpp
  clangd/index/dex/Iterator.h
  unittests/clangd/DexIndexTests.cpp


Index: unittests/clangd/DexIndexTests.cpp
===
--- unittests/clangd/DexIndexTests.cpp
+++ unittests/clangd/DexIndexTests.cpp
@@ -231,13 +231,14 @@
   const PostingList L4 = {0, 1, 5};
   const PostingList L5;
 
-  EXPECT_EQ(llvm::to_string(*(create(L0))), "[4, 7, 8, 20, 42, 100]");
+  EXPECT_EQ(llvm::to_string(*(create(L0))), "[{4}, 7, 8, 20, 42, 100, END]");
 
   auto Nested = createAnd(createAnd(create(L1), create(L2)),
   createOr(create(L3), create(L4), create(L5)));
 
   EXPECT_EQ(llvm::to_string(*Nested),
-"(& (& [1, 3, 5, 8, 9] [1, 5, 7, 9]) (| [0, 5] [0, 1, 5] []))");
+"(& (& [{1}, 3, 5, 8, 9, END] [{1}, 5, 7, 9, END]) (| [0, {5}, "
+"END] [0, {1}, 5, END] [{END}]))");
 }
 
 TEST(DexIndexIterators, Limit) {
Index: clangd/index/dex/Iterator.cpp
===
--- clangd/index/dex/Iterator.cpp
+++ clangd/index/dex/Iterator.cpp
@@ -49,10 +49,19 @@
   llvm::raw_ostream &dump(llvm::raw_ostream &OS) const override {
     OS << '[';
     auto Separator = "";
-    for (const auto &ID : Documents) {
-      OS << Separator << ID;
+    for (auto It = std::begin(Documents); It != std::end(Documents); ++It) {
+      OS << Separator;
+      if (It == Index)
+        OS << '{' << *It << '}';
+      else
+        OS << *It;
       Separator = ", ";
     }
+    OS << Separator;
+    if (Index == std::end(Documents))
+      OS << "{END}";
+    else
+      OS << "END";
     OS << ']';
     return OS;
   }
Index: clangd/index/dex/Iterator.h
===
--- clangd/index/dex/Iterator.h
+++ clangd/index/dex/Iterator.h
@@ -99,7 +99,9 @@
   ///
   /// Where Type is the iterator type representation: "&" for And, "|" for Or,
   /// ChildN is N-th iterator child. Raw iterators over PostingList are
-  /// represented as "[ID1, ID2, ...]" where IDN is N-th PostingList entry.
+  /// represented as "[ID1, ID2, ..., {IDN}, ... END]" where IDN is N-th
+  /// PostingList entry and the element which is pointed to by the PostingList
+  /// iterator is enclosed in {} braces.
   friend llvm::raw_ostream &operator<<(llvm::raw_ostream &OS,
const Iterator &Iterator) {
 return Iterator.dump(OS);



[PATCH] D50337: [clangd] DexIndex implementation prototype

2018-08-16 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev updated this revision to Diff 161017.
kbobyrev added a comment.

Sorry, the last diff was the old one. Should be correct now.


https://reviews.llvm.org/D50337

Files:
  clang-tools-extra/clangd/CMakeLists.txt
  clang-tools-extra/clangd/index/dex/DexIndex.cpp
  clang-tools-extra/clangd/index/dex/DexIndex.h
  clang-tools-extra/clangd/index/dex/Token.h
  clang-tools-extra/unittests/clangd/CMakeLists.txt
  clang-tools-extra/unittests/clangd/DexIndexTests.cpp
  clang-tools-extra/unittests/clangd/IndexTests.cpp
  clang-tools-extra/unittests/clangd/TestIndexOperations.cpp
  clang-tools-extra/unittests/clangd/TestIndexOperations.h

Index: clang-tools-extra/unittests/clangd/TestIndexOperations.h
===
--- /dev/null
+++ clang-tools-extra/unittests/clangd/TestIndexOperations.h
@@ -0,0 +1,57 @@
+//===-- IndexHelpers.h --*- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+
+#ifndef LLVM_CLANG_TOOLS_EXTRA_UNITTESTS_CLANGD_INDEXTESTCOMMON_H
+#define LLVM_CLANG_TOOLS_EXTRA_UNITTESTS_CLANGD_INDEXTESTCOMMON_H
+
+#include "index/Index.h"
+#include "index/Merge.h"
+#include "index/dex/DexIndex.h"
+#include "index/dex/Iterator.h"
+#include "index/dex/Token.h"
+#include "index/dex/Trigram.h"
+
+namespace clang {
+namespace clangd {
+
+Symbol symbol(llvm::StringRef QName);
+
+struct SlabAndPointers {
+  SymbolSlab Slab;
+  std::vector<const Symbol *> Pointers;
+};
+
+// Create a slab of symbols with the given qualified names as both IDs and
+// names. The life time of the slab is managed by the returned shared pointer.
+// If \p WeakSymbols is provided, it will be pointed to the managed object in
+// the returned shared pointer.
+std::shared_ptr<std::vector<const Symbol *>>
+generateSymbols(std::vector<std::string> QualifiedNames,
+                std::weak_ptr<SlabAndPointers> *WeakSymbols = nullptr);
+
+// Create a slab of symbols with IDs and names [Begin, End], otherwise identical
+// to the `generateSymbols` above.
+std::shared_ptr<std::vector<const Symbol *>>
+generateNumSymbols(int Begin, int End,
+                   std::weak_ptr<SlabAndPointers> *WeakSymbols = nullptr);
+
+std::string getQualifiedName(const Symbol &Sym);
+
+std::vector<std::string> match(const SymbolIndex &I,
+                               const FuzzyFindRequest &Req,
+                               bool *Incomplete = nullptr);
+
+// Returns qualified names of symbols with any of IDs in the index.
+std::vector<std::string> lookup(const SymbolIndex &I,
+                                llvm::ArrayRef<SymbolID> IDs);
+
+} // namespace clangd
+} // namespace clang
+
+#endif
Index: clang-tools-extra/unittests/clangd/TestIndexOperations.cpp
===
--- /dev/null
+++ clang-tools-extra/unittests/clangd/TestIndexOperations.cpp
@@ -0,0 +1,89 @@
+//===-- IndexHelpers.cpp *- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+
+#include "TestIndexOperations.h"
+
+namespace clang {
+namespace clangd {
+
+Symbol symbol(llvm::StringRef QName) {
+  Symbol Sym;
+  Sym.ID = SymbolID(QName.str());
+  size_t Pos = QName.rfind("::");
+  if (Pos == llvm::StringRef::npos) {
+    Sym.Name = QName;
+    Sym.Scope = "";
+  } else {
+    Sym.Name = QName.substr(Pos + 2);
+    Sym.Scope = QName.substr(0, Pos + 2);
+  }
+  return Sym;
+}
+
+// Create a slab of symbols with the given qualified names as both IDs and
+// names. The life time of the slab is managed by the returned shared pointer.
+// If \p WeakSymbols is provided, it will be pointed to the managed object in
+// the returned shared pointer.
+std::shared_ptr<std::vector<const Symbol *>>
+generateSymbols(std::vector<std::string> QualifiedNames,
+                std::weak_ptr<SlabAndPointers> *WeakSymbols) {
+  SymbolSlab::Builder Slab;
+  for (llvm::StringRef QName : QualifiedNames)
+    Slab.insert(symbol(QName));
+
+  auto Storage = std::make_shared<SlabAndPointers>();
+  Storage->Slab = std::move(Slab).build();
+  for (const auto &Sym : Storage->Slab)
+    Storage->Pointers.push_back(&Sym);
+  if (WeakSymbols)
+    *WeakSymbols = Storage;
+  auto *Pointers = &Storage->Pointers;
+  return {std::move(Storage), Pointers};
+}
+
+// Create a slab of symbols with IDs and names [Begin, End], otherwise identical
+// to the `generateSymbols` above.
+std::shared_ptr<std::vector<const Symbol *>>
+generateNumSymbols(int Begin, int End,
+                   std::weak_ptr<SlabAndPointers> *WeakSymbols) {
+  std::vector<std::string> Names;
+  for (int i = Begin; i <= End; i++)
+    Names.push_back(std::to_string(i));
+  return generateSymbols(Names, WeakSymbols);
+}
+
+std::string getQualifiedName(const Symbol &Sym) {
+  return (Sym.Scope + Sym.Name).str()

[PATCH] D50700: [clangd] Generate better incomplete bigrams for the Dex index

2018-08-16 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev added inline comments.



Comment at: clang-tools-extra/unittests/clangd/DexIndexTests.cpp:324
 
+  EXPECT_THAT(generateQueryTrigrams("u_p"), trigramsAre({"up$"}));
+  EXPECT_THAT(generateQueryTrigrams("_u_p"), trigramsAre({"_u_"}));

ioeric wrote:
> I'm not sure if this is correct. If users have explicitly typed `_`, they are 
> likely to want a `_` there. You mentioned in the patch summary that users 
> might want to match two heads with this. Could you provide an example?
The particular example I had in mind was `unique_ptr`. Effectively, if the 
query is `SC`, then `StrCat` is matched (because the incomplete trigram is 
`sc$`, and the two heads of the symbol identifier also yield `sc$`), but for 
the query `u_p`, `unique_ptr` is currently not matched.

On the other hand, if there is a symbol like `m_field` and the user types `mf` 
or `m_f`, it is matched in both cases, because `m_field` yields `mf$` at the 
index build stage. So this change doesn't decrease code completion quality for 
such cases.
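
To make the index-side half of this concrete, here is a minimal sketch of how 
the first two heads of an identifier could be turned into an incomplete 
trigram. It is not the code from the patch: `headsTrigramSketch` is a 
hypothetical helper, and the segmentation rule (a head is the first character, 
a character following `_`, or an uppercase letter following a lowercase one) is 
an assumption made for illustration. Query-side generation is a separate step 
and is not modeled here.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper (an assumption, not clangd's implementation): collect
// the "heads" of an identifier, lowercase them, and pad the first two with '$'
// to form an incomplete trigram. Returns an empty string when the identifier
// has fewer than two heads.
std::string headsTrigramSketch(const std::string &Identifier) {
  std::string Heads;
  bool AtSegmentStart = true; // true at the beginning and right after '_'
  char Prev = '\0';
  for (char C : Identifier) {
    if (C == '_') {
      AtSegmentStart = true;
      Prev = C;
      continue;
    }
    const unsigned char UC = static_cast<unsigned char>(C);
    const unsigned char UPrev = static_cast<unsigned char>(Prev);
    // A lowercase-to-uppercase transition starts a new camelCase segment.
    const bool CamelBoundary = std::isupper(UC) && std::islower(UPrev);
    if (AtSegmentStart || CamelBoundary)
      Heads.push_back(static_cast<char>(std::tolower(UC)));
    AtSegmentStart = false;
    Prev = C;
  }
  if (Heads.size() < 2)
    return "";
  return Heads.substr(0, 2) + "$";
}
```

Under these assumptions `unique_ptr` yields `up$`, `StrCat` yields `sc$` and 
`m_field` yields `mf$`, which are the index-side trigrams a query would have to 
reproduce in order to match.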


https://reviews.llvm.org/D50700



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D50839: [llvm] Optimize YAML::isNumeric

2018-08-16 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev updated this revision to Diff 161027.
kbobyrev marked 2 inline comments as done.
kbobyrev added a subscriber: lebedev.ri.
kbobyrev added a comment.
Herald added a subscriber: mgorny.

Very good point by @lebedev.ri! I have added a very simple fuzzer for the 
parser. So far, it has found no issues with the current implementation. I have 
not exposed the regexp matcher in the header, though, because it won't be used 
anywhere else.


https://reviews.llvm.org/D50839

Files:
  llvm/include/llvm/Support/YAMLTraits.h
  llvm/tools/llvm-yaml-numeric-parser-fuzzer/CMakeLists.txt
  llvm/tools/llvm-yaml-numeric-parser-fuzzer/DummyYAMLNumericParserFuzzer.cpp
  llvm/tools/llvm-yaml-numeric-parser-fuzzer/yaml-numeric-parser-fuzzer.cpp
  llvm/unittests/Support/YAMLIOTest.cpp

Index: llvm/unittests/Support/YAMLIOTest.cpp
===
--- llvm/unittests/Support/YAMLIOTest.cpp
+++ llvm/unittests/Support/YAMLIOTest.cpp
@@ -16,16 +16,17 @@
 #include "gmock/gmock.h"
 #include "gtest/gtest.h"
 
+using llvm::yaml::Hex16;
+using llvm::yaml::Hex32;
+using llvm::yaml::Hex64;
+using llvm::yaml::Hex8;
 using llvm::yaml::Input;
-using llvm::yaml::Output;
 using llvm::yaml::IO;
-using llvm::yaml::MappingTraits;
+using llvm::yaml::isNumeric;
 using llvm::yaml::MappingNormalization;
+using llvm::yaml::MappingTraits;
+using llvm::yaml::Output;
 using llvm::yaml::ScalarTraits;
-using llvm::yaml::Hex8;
-using llvm::yaml::Hex16;
-using llvm::yaml::Hex32;
-using llvm::yaml::Hex64;
 using ::testing::StartsWith;
 
 
@@ -2569,3 +2570,64 @@
 TestEscaped((char const *)foobar, "\"foo\\u200Bbar\"");
   }
 }
+
+TEST(YAMLIO, Numeric) {
+  EXPECT_TRUE(isNumeric(".inf"));
+  EXPECT_TRUE(isNumeric(".INF"));
+  EXPECT_TRUE(isNumeric(".Inf"));
+  EXPECT_TRUE(isNumeric("-.inf"));
+  EXPECT_TRUE(isNumeric("+.inf"));
+
+  EXPECT_TRUE(isNumeric(".nan"));
+  EXPECT_TRUE(isNumeric(".NaN"));
+  EXPECT_TRUE(isNumeric(".NAN"));
+
+  EXPECT_TRUE(isNumeric("0"));
+  EXPECT_TRUE(isNumeric("0."));
+  EXPECT_TRUE(isNumeric("0.0"));
+  EXPECT_TRUE(isNumeric("-0.0"));
+  EXPECT_TRUE(isNumeric("+0.0"));
+
+  EXPECT_TRUE(isNumeric("12345"));
+  EXPECT_TRUE(isNumeric("012345"));
+  EXPECT_TRUE(isNumeric("+12.0"));
+  EXPECT_TRUE(isNumeric(".5"));
+  EXPECT_TRUE(isNumeric("+.5"));
+  EXPECT_TRUE(isNumeric("-1.0"));
+
+  EXPECT_TRUE(isNumeric("2.3e4"));
+  EXPECT_TRUE(isNumeric("-2E+05"));
+  EXPECT_TRUE(isNumeric("+12e03"));
+  EXPECT_TRUE(isNumeric("6.8523015e+5"));
+
+  EXPECT_TRUE(isNumeric("1.e+1"));
+  EXPECT_TRUE(isNumeric(".0e+1"));
+
+  EXPECT_TRUE(isNumeric("0x2aF3"));
+  EXPECT_TRUE(isNumeric("0o01234567"));
+
+  EXPECT_FALSE(isNumeric("not a number"));
+  EXPECT_FALSE(isNumeric("."));
+  EXPECT_FALSE(isNumeric(".e+1"));
+  EXPECT_FALSE(isNumeric(".1e"));
+  EXPECT_FALSE(isNumeric(".1e+"));
+  EXPECT_FALSE(isNumeric(".1e++1"));
+
+  EXPECT_FALSE(isNumeric("+0x2AF3"));
+  EXPECT_FALSE(isNumeric("-0x2AF3"));
+  EXPECT_FALSE(isNumeric("0x2AF3Z"));
+  EXPECT_FALSE(isNumeric("0o012345678"));
+  EXPECT_FALSE(isNumeric("-0o012345678"));
+  EXPECT_FALSE(isNumeric("03A8229434B839616A25C16B0291F77A438B"));
+
+  // Deprecated formats: as of the YAML 1.2 specification, the following are
+  // no longer valid numbers:
+  //
+  // * Sexagesimal numbers
+  // * Decimal numbers with comma as the delimiter
+  // * "inf", "nan" without '.' prefix
+  EXPECT_FALSE(isNumeric("3:25:45"));
+  EXPECT_FALSE(isNumeric("+12,345"));
+  EXPECT_FALSE(isNumeric("-inf"));
+  EXPECT_FALSE(isNumeric("1,230.15"));
+}
Index: llvm/tools/llvm-yaml-numeric-parser-fuzzer/yaml-numeric-parser-fuzzer.cpp
===
--- /dev/null
+++ llvm/tools/llvm-yaml-numeric-parser-fuzzer/yaml-numeric-parser-fuzzer.cpp
@@ -0,0 +1,44 @@
+//===--- special-case-list-fuzzer.cpp - Fuzzer for special case lists -===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+
+#include "llvm/ADT/StringRef.h"
+#include "llvm/Support/Regex.h"
+#include "llvm/Support/YAMLTraits.h"
+#include 
+#include 
+
+llvm::Regex InifnityMatcher("^[-+]?(\\.inf|\\.Inf|\\.INF)$");
+llvm::Regex Base8("^0o[0-7]+$");
+llvm::Regex Base16("^0x[0-9a-fA-F]+$");
+llvm::Regex Float("^[-+]?(\\.[0-9]+|[0-9]+(\\.[0-9]*)?)([eE][-+]?[0-9]+)?$");
+
+inline bool isNumericRegex(llvm::StringRef S) {
+  if (S.equals(".nan") || S.equals(".NaN") || S.equals(".NAN"))
+return true;
+
+  if (InifnityMatcher.match(S))
+return true;
+
+  if (Base8.match(S))
+return true;
+
+  if (Base16.match(S))
+return true;
+
+  if (Float.match(S))
+return true;
+
+  return false;
+}
+
+extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
+  std::string Input(reinterpret_cast<const char *>(Data), Size);
+  assert(llvm::yaml::isNum

[PATCH] D50839: [llvm] Optimize YAML::isNumeric

2018-08-16 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev updated this revision to Diff 161033.
kbobyrev added a comment.

Use consistent `Regex` matchers naming: don't append "Matcher" at the end.


https://reviews.llvm.org/D50839

Files:
  llvm/include/llvm/Support/YAMLTraits.h
  llvm/tools/llvm-yaml-numeric-parser-fuzzer/CMakeLists.txt
  llvm/tools/llvm-yaml-numeric-parser-fuzzer/DummyYAMLNumericParserFuzzer.cpp
  llvm/tools/llvm-yaml-numeric-parser-fuzzer/yaml-numeric-parser-fuzzer.cpp
  llvm/unittests/Support/YAMLIOTest.cpp

Index: llvm/unittests/Support/YAMLIOTest.cpp
===
--- llvm/unittests/Support/YAMLIOTest.cpp
+++ llvm/unittests/Support/YAMLIOTest.cpp
@@ -16,16 +16,17 @@
 #include "gmock/gmock.h"
 #include "gtest/gtest.h"
 
+using llvm::yaml::Hex16;
+using llvm::yaml::Hex32;
+using llvm::yaml::Hex64;
+using llvm::yaml::Hex8;
 using llvm::yaml::Input;
-using llvm::yaml::Output;
 using llvm::yaml::IO;
-using llvm::yaml::MappingTraits;
+using llvm::yaml::isNumeric;
 using llvm::yaml::MappingNormalization;
+using llvm::yaml::MappingTraits;
+using llvm::yaml::Output;
 using llvm::yaml::ScalarTraits;
-using llvm::yaml::Hex8;
-using llvm::yaml::Hex16;
-using llvm::yaml::Hex32;
-using llvm::yaml::Hex64;
 using ::testing::StartsWith;
 
 
@@ -2569,3 +2570,64 @@
 TestEscaped((char const *)foobar, "\"foo\\u200Bbar\"");
   }
 }
+
+TEST(YAMLIO, Numeric) {
+  EXPECT_TRUE(isNumeric(".inf"));
+  EXPECT_TRUE(isNumeric(".INF"));
+  EXPECT_TRUE(isNumeric(".Inf"));
+  EXPECT_TRUE(isNumeric("-.inf"));
+  EXPECT_TRUE(isNumeric("+.inf"));
+
+  EXPECT_TRUE(isNumeric(".nan"));
+  EXPECT_TRUE(isNumeric(".NaN"));
+  EXPECT_TRUE(isNumeric(".NAN"));
+
+  EXPECT_TRUE(isNumeric("0"));
+  EXPECT_TRUE(isNumeric("0."));
+  EXPECT_TRUE(isNumeric("0.0"));
+  EXPECT_TRUE(isNumeric("-0.0"));
+  EXPECT_TRUE(isNumeric("+0.0"));
+
+  EXPECT_TRUE(isNumeric("12345"));
+  EXPECT_TRUE(isNumeric("012345"));
+  EXPECT_TRUE(isNumeric("+12.0"));
+  EXPECT_TRUE(isNumeric(".5"));
+  EXPECT_TRUE(isNumeric("+.5"));
+  EXPECT_TRUE(isNumeric("-1.0"));
+
+  EXPECT_TRUE(isNumeric("2.3e4"));
+  EXPECT_TRUE(isNumeric("-2E+05"));
+  EXPECT_TRUE(isNumeric("+12e03"));
+  EXPECT_TRUE(isNumeric("6.8523015e+5"));
+
+  EXPECT_TRUE(isNumeric("1.e+1"));
+  EXPECT_TRUE(isNumeric(".0e+1"));
+
+  EXPECT_TRUE(isNumeric("0x2aF3"));
+  EXPECT_TRUE(isNumeric("0o01234567"));
+
+  EXPECT_FALSE(isNumeric("not a number"));
+  EXPECT_FALSE(isNumeric("."));
+  EXPECT_FALSE(isNumeric(".e+1"));
+  EXPECT_FALSE(isNumeric(".1e"));
+  EXPECT_FALSE(isNumeric(".1e+"));
+  EXPECT_FALSE(isNumeric(".1e++1"));
+
+  EXPECT_FALSE(isNumeric("+0x2AF3"));
+  EXPECT_FALSE(isNumeric("-0x2AF3"));
+  EXPECT_FALSE(isNumeric("0x2AF3Z"));
+  EXPECT_FALSE(isNumeric("0o012345678"));
+  EXPECT_FALSE(isNumeric("-0o012345678"));
+  EXPECT_FALSE(isNumeric("03A8229434B839616A25C16B0291F77A438B"));
+
+  // Deprecated formats: as of the YAML 1.2 specification, the following are
+  // no longer valid numbers:
+  //
+  // * Sexagesimal numbers
+  // * Decimal numbers with comma as the delimiter
+  // * "inf", "nan" without '.' prefix
+  EXPECT_FALSE(isNumeric("3:25:45"));
+  EXPECT_FALSE(isNumeric("+12,345"));
+  EXPECT_FALSE(isNumeric("-inf"));
+  EXPECT_FALSE(isNumeric("1,230.15"));
+}
Index: llvm/tools/llvm-yaml-numeric-parser-fuzzer/yaml-numeric-parser-fuzzer.cpp
===
--- /dev/null
+++ llvm/tools/llvm-yaml-numeric-parser-fuzzer/yaml-numeric-parser-fuzzer.cpp
@@ -0,0 +1,44 @@
+//===--- special-case-list-fuzzer.cpp - Fuzzer for special case lists -===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+
+#include "llvm/ADT/StringRef.h"
+#include "llvm/Support/Regex.h"
+#include "llvm/Support/YAMLTraits.h"
+#include 
+#include 
+
+llvm::Regex Inifnity("^[-+]?(\\.inf|\\.Inf|\\.INF)$");
+llvm::Regex Base8("^0o[0-7]+$");
+llvm::Regex Base16("^0x[0-9a-fA-F]+$");
+llvm::Regex Float("^[-+]?(\\.[0-9]+|[0-9]+(\\.[0-9]*)?)([eE][-+]?[0-9]+)?$");
+
+inline bool isNumericRegex(llvm::StringRef S) {
+  if (S.equals(".nan") || S.equals(".NaN") || S.equals(".NAN"))
+return true;
+
+  if (Inifnity.match(S))
+return true;
+
+  if (Base8.match(S))
+return true;
+
+  if (Base16.match(S))
+return true;
+
+  if (Float.match(S))
+return true;
+
+  return false;
+}
+
+extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
+  std::string Input(reinterpret_cast<const char *>(Data), Size);
+  assert(llvm::yaml::isNumeric(Input) == isNumericRegex(Input));
+  return 0;
+}
Index: llvm/tools/llvm-yaml-numeric-parser-fuzzer/DummyYAMLNumericParserFuzzer.cpp
===
--- /dev/null
+++ llvm/tools/llvm-yaml-numeric-parser-fuzzer/DummyYAMLNumericParserFuzzer.cpp
@

[PATCH] D50839: [llvm] Optimize YAML::isNumeric

2018-08-17 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev updated this revision to Diff 161191.
kbobyrev marked 4 inline comments as done.
kbobyrev added a comment.

Upload a version which is IMO readable.


https://reviews.llvm.org/D50839

Files:
  llvm/include/llvm/Support/YAMLTraits.h
  llvm/tools/llvm-yaml-numeric-parser-fuzzer/CMakeLists.txt
  llvm/tools/llvm-yaml-numeric-parser-fuzzer/DummyYAMLNumericParserFuzzer.cpp
  llvm/tools/llvm-yaml-numeric-parser-fuzzer/yaml-numeric-parser-fuzzer.cpp
  llvm/unittests/Support/YAMLIOTest.cpp

Index: llvm/unittests/Support/YAMLIOTest.cpp
===
--- llvm/unittests/Support/YAMLIOTest.cpp
+++ llvm/unittests/Support/YAMLIOTest.cpp
@@ -16,16 +16,17 @@
 #include "gmock/gmock.h"
 #include "gtest/gtest.h"
 
+using llvm::yaml::Hex16;
+using llvm::yaml::Hex32;
+using llvm::yaml::Hex64;
+using llvm::yaml::Hex8;
 using llvm::yaml::Input;
-using llvm::yaml::Output;
 using llvm::yaml::IO;
-using llvm::yaml::MappingTraits;
+using llvm::yaml::isNumeric;
 using llvm::yaml::MappingNormalization;
+using llvm::yaml::MappingTraits;
+using llvm::yaml::Output;
 using llvm::yaml::ScalarTraits;
-using llvm::yaml::Hex8;
-using llvm::yaml::Hex16;
-using llvm::yaml::Hex32;
-using llvm::yaml::Hex64;
 using ::testing::StartsWith;
 
 
@@ -2569,3 +2570,66 @@
 TestEscaped((char const *)foobar, "\"foo\\u200Bbar\"");
   }
 }
+
+TEST(YAMLIO, Numeric) {
+  EXPECT_TRUE(isNumeric(".inf"));
+  EXPECT_TRUE(isNumeric(".INF"));
+  EXPECT_TRUE(isNumeric(".Inf"));
+  EXPECT_TRUE(isNumeric("-.inf"));
+  EXPECT_TRUE(isNumeric("+.inf"));
+
+  EXPECT_TRUE(isNumeric(".nan"));
+  EXPECT_TRUE(isNumeric(".NaN"));
+  EXPECT_TRUE(isNumeric(".NAN"));
+
+  EXPECT_TRUE(isNumeric("0"));
+  EXPECT_TRUE(isNumeric("0."));
+  EXPECT_TRUE(isNumeric("0.0"));
+  EXPECT_TRUE(isNumeric("-0.0"));
+  EXPECT_TRUE(isNumeric("+0.0"));
+
+  EXPECT_TRUE(isNumeric("12345"));
+  EXPECT_TRUE(isNumeric("012345"));
+  EXPECT_TRUE(isNumeric("+12.0"));
+  EXPECT_TRUE(isNumeric(".5"));
+  EXPECT_TRUE(isNumeric("+.5"));
+  EXPECT_TRUE(isNumeric("-1.0"));
+
+  EXPECT_TRUE(isNumeric("2.3e4"));
+  EXPECT_TRUE(isNumeric("-2E+05"));
+  EXPECT_TRUE(isNumeric("+12e03"));
+  EXPECT_TRUE(isNumeric("6.8523015e+5"));
+
+  EXPECT_TRUE(isNumeric("1.e+1"));
+  EXPECT_TRUE(isNumeric(".0e+1"));
+
+  EXPECT_TRUE(isNumeric("0x2aF3"));
+  EXPECT_TRUE(isNumeric("0o01234567"));
+
+  EXPECT_FALSE(isNumeric("not a number"));
+  EXPECT_FALSE(isNumeric("."));
+  EXPECT_FALSE(isNumeric(".e+1"));
+  EXPECT_FALSE(isNumeric(".1e"));
+  EXPECT_FALSE(isNumeric(".1e+"));
+  EXPECT_FALSE(isNumeric(".1e++1"));
+
+  EXPECT_FALSE(isNumeric("ABCD"));
+  EXPECT_FALSE(isNumeric("+0x2AF3"));
+  EXPECT_FALSE(isNumeric("-0x2AF3"));
+  EXPECT_FALSE(isNumeric("0x2AF3Z"));
+  EXPECT_FALSE(isNumeric("0o012345678"));
+  EXPECT_FALSE(isNumeric("0xZ"));
+  EXPECT_FALSE(isNumeric("-0o012345678"));
+  EXPECT_FALSE(isNumeric("03A8229434B839616A25C16B0291F77A438B"));
+
+  // Deprecated formats: as of the YAML 1.2 specification, the following are
+  // no longer valid numbers:
+  //
+  // * Sexagesimal numbers
+  // * Decimal numbers with comma as the delimiter
+  // * "inf", "nan" without '.' prefix
+  EXPECT_FALSE(isNumeric("3:25:45"));
+  EXPECT_FALSE(isNumeric("+12,345"));
+  EXPECT_FALSE(isNumeric("-inf"));
+  EXPECT_FALSE(isNumeric("1,230.15"));
+}
Index: llvm/tools/llvm-yaml-numeric-parser-fuzzer/yaml-numeric-parser-fuzzer.cpp
===
--- /dev/null
+++ llvm/tools/llvm-yaml-numeric-parser-fuzzer/yaml-numeric-parser-fuzzer.cpp
@@ -0,0 +1,44 @@
+//===--- special-case-list-fuzzer.cpp - Fuzzer for special case lists -===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+
+#include "llvm/ADT/StringRef.h"
+#include "llvm/Support/Regex.h"
+#include "llvm/Support/YAMLTraits.h"
+#include 
+#include 
+
+llvm::Regex Infinity("^[-+]?(\\.inf|\\.Inf|\\.INF)$");
+llvm::Regex Base8("^0o[0-7]+$");
+llvm::Regex Base16("^0x[0-9a-fA-F]+$");
+llvm::Regex Float("^[-+]?(\\.[0-9]+|[0-9]+(\\.[0-9]*)?)([eE][-+]?[0-9]+)?$");
+
+inline bool isNumericRegex(llvm::StringRef S) {
+  if (S.equals(".nan") || S.equals(".NaN") || S.equals(".NAN"))
+return true;
+
+  if (Infinity.match(S))
+return true;
+
+  if (Base8.match(S))
+return true;
+
+  if (Base16.match(S))
+return true;
+
+  if (Float.match(S))
+return true;
+
+  return false;
+}
+
+extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
+  std::string Input(reinterpret_cast<const char *>(Data), Size);
+  assert(llvm::yaml::isNumeric(Input) == isNumericRegex(Input));
+  return 0;
+}
Index: llvm/tools/llvm-yaml-numeric-parser-fuzzer/DummyYAMLNumericParserFuzzer.cpp
===
--- /dev/null
+++ 

[PATCH] D50839: [llvm] Optimize YAML::isNumeric

2018-08-17 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev added inline comments.



Comment at: llvm/include/llvm/Support/YAMLTraits.h:454
+inline bool isNumeric(StringRef S) {
+  if (S.empty())
+return false;

zturner wrote:
> What would happen if we re-wrote this entire function as:
> 
> ```
> inline bool isNumeric(StringRef S) {
>   uint64_t N;
>   int64_t I;
>   APFloat F;
>   return S.getAsInteger(N) || S.getAsInteger(I) || (F.convertFromString(S) == 
> opOK);
> }
> ```
> 
> Would this a) Be correct, and b) have similar performance characteristics to 
> what you've got here?
Thank you for the suggestion!

I have tried the proposed approach, but there are several caveats:

First, parsing with `APInt` (which I believe should be used here, since YAML 
numbers are of arbitrary length) does not look simpler than the current 
approach. It also adds unnecessary overhead and may accept some inputs which 
are invalid in YAML but perfectly fine for the `APInt` parser. An example is 
the prefix of octal numbers: `APInt` accepts `0` while YAML requires `0o`, so 
the `Radix` would have to be inferred manually anyway.

The main problem, however, is the `APFloat` parser, which accepts many inputs 
that are not valid in the YAML numeric format. Examples are:

* `.`
* `.e+1`
* `.e+`
* `.e`

Even worse, the parser appears to have bugs. I was able to find several classes 
of inputs which cause global-buffer-overflow caught by AddressSanitizer (e.g. 
`.+`). This should be investigated independently. However, the above cases lead 
me to believe that:

1) The LLVM parser is likely to have a huge number of cases which are invalid 
in YAML numeric format but are valid `APFloat`s. Finding all of these cases is 
non-trivial and is probably not rewarding.
2) The parser is unreliable.

What do you think?
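
For reference, the YAML-side rules I am comparing against can be sketched with 
the standard library alone. This is a rough illustration, not the code in the 
patch: it reuses the patterns from the regex oracle in the fuzzer, but 
`isNumericRegexSketch` is a hypothetical name and `std::regex` stands in for 
`llvm::Regex` here.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical stand-alone check (an assumption, not llvm::yaml::isNumeric).
// std::regex_match requires the whole string to match, so the patterns need
// no explicit ^ and $ anchors.
bool isNumericRegexSketch(const std::string &S) {
  static const std::regex Infinity(R"re([-+]?(\.inf|\.Inf|\.INF))re");
  static const std::regex Base8(R"re(0o[0-7]+)re");
  static const std::regex Base16(R"re(0x[0-9a-fA-F]+)re");
  static const std::regex Float(
      R"re([-+]?(\.[0-9]+|[0-9]+(\.[0-9]*)?)([eE][-+]?[0-9]+)?)re");

  if (S == ".nan" || S == ".NaN" || S == ".NAN")
    return true;
  return std::regex_match(S, Infinity) || std::regex_match(S, Base8) ||
         std::regex_match(S, Base16) || std::regex_match(S, Float);
}
```

With these patterns, inputs such as `.`, `.e+1` and `.+` are rejected because 
the mantissa must contain at least one digit, and octal numbers are only 
accepted with the `0o` prefix.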


https://reviews.llvm.org/D50839





[PATCH] D50839: [llvm] Optimize YAML::isNumeric

2018-08-17 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev updated this revision to Diff 161192.
kbobyrev added a comment.

I tried to rewrite the loop, but IMO it looks even worse now.


https://reviews.llvm.org/D50839

Files:
  llvm/include/llvm/Support/YAMLTraits.h
  llvm/tools/llvm-yaml-numeric-parser-fuzzer/CMakeLists.txt
  llvm/tools/llvm-yaml-numeric-parser-fuzzer/DummyYAMLNumericParserFuzzer.cpp
  llvm/tools/llvm-yaml-numeric-parser-fuzzer/yaml-numeric-parser-fuzzer.cpp
  llvm/unittests/Support/YAMLIOTest.cpp

Index: llvm/unittests/Support/YAMLIOTest.cpp
===
--- llvm/unittests/Support/YAMLIOTest.cpp
+++ llvm/unittests/Support/YAMLIOTest.cpp
@@ -16,16 +16,17 @@
 #include "gmock/gmock.h"
 #include "gtest/gtest.h"
 
+using llvm::yaml::Hex16;
+using llvm::yaml::Hex32;
+using llvm::yaml::Hex64;
+using llvm::yaml::Hex8;
 using llvm::yaml::Input;
-using llvm::yaml::Output;
 using llvm::yaml::IO;
-using llvm::yaml::MappingTraits;
+using llvm::yaml::isNumeric;
 using llvm::yaml::MappingNormalization;
+using llvm::yaml::MappingTraits;
+using llvm::yaml::Output;
 using llvm::yaml::ScalarTraits;
-using llvm::yaml::Hex8;
-using llvm::yaml::Hex16;
-using llvm::yaml::Hex32;
-using llvm::yaml::Hex64;
 using ::testing::StartsWith;
 
 
@@ -2569,3 +2570,71 @@
 TestEscaped((char const *)foobar, "\"foo\\u200Bbar\"");
   }
 }
+
+TEST(YAMLIO, Numeric) {
+  EXPECT_TRUE(isNumeric(".inf"));
+  EXPECT_TRUE(isNumeric(".INF"));
+  EXPECT_TRUE(isNumeric(".Inf"));
+  EXPECT_TRUE(isNumeric("-.inf"));
+  EXPECT_TRUE(isNumeric("+.inf"));
+
+  EXPECT_TRUE(isNumeric(".nan"));
+  EXPECT_TRUE(isNumeric(".NaN"));
+  EXPECT_TRUE(isNumeric(".NAN"));
+
+  EXPECT_TRUE(isNumeric("0"));
+  EXPECT_TRUE(isNumeric("0."));
+  EXPECT_TRUE(isNumeric("0.0"));
+  EXPECT_TRUE(isNumeric("-0.0"));
+  EXPECT_TRUE(isNumeric("+0.0"));
+
+  EXPECT_TRUE(isNumeric("12345"));
+  EXPECT_TRUE(isNumeric("012345"));
+  EXPECT_TRUE(isNumeric("+12.0"));
+  EXPECT_TRUE(isNumeric(".5"));
+  EXPECT_TRUE(isNumeric("+.5"));
+  EXPECT_TRUE(isNumeric("-1.0"));
+
+  EXPECT_TRUE(isNumeric("2.3e4"));
+  EXPECT_TRUE(isNumeric("-2E+05"));
+  EXPECT_TRUE(isNumeric("+12e03"));
+  EXPECT_TRUE(isNumeric("6.8523015e+5"));
+
+  EXPECT_TRUE(isNumeric("1.e+1"));
+  EXPECT_TRUE(isNumeric(".0e+1"));
+
+  EXPECT_TRUE(isNumeric("0x2aF3"));
+  EXPECT_TRUE(isNumeric("0o01234567"));
+
+  EXPECT_FALSE(isNumeric("not a number"));
+  EXPECT_FALSE(isNumeric("."));
+  EXPECT_FALSE(isNumeric(".e+1"));
+  EXPECT_FALSE(isNumeric(".1e"));
+  EXPECT_FALSE(isNumeric(".1e+"));
+  EXPECT_FALSE(isNumeric(".1e++1"));
+
+  EXPECT_FALSE(isNumeric("ABCD"));
+  EXPECT_FALSE(isNumeric("+0x2AF3"));
+  EXPECT_FALSE(isNumeric("-0x2AF3"));
+  EXPECT_FALSE(isNumeric("0x2AF3Z"));
+  EXPECT_FALSE(isNumeric("0o012345678"));
+  EXPECT_FALSE(isNumeric("0xZ"));
+  EXPECT_FALSE(isNumeric("-0o012345678"));
+  EXPECT_FALSE(isNumeric("03A8229434B839616A25C16B0291F77A438B"));
+
+  EXPECT_FALSE(isNumeric("."));
+  EXPECT_FALSE(isNumeric(".e+1"));
+  EXPECT_FALSE(isNumeric(".e+"));
+  EXPECT_FALSE(isNumeric(".e"));
+
+  // Deprecated formats: as of the YAML 1.2 specification, the following are
+  // no longer valid numbers:
+  //
+  // * Sexagesimal numbers
+  // * Decimal numbers with comma as the delimiter
+  // * "inf", "nan" without '.' prefix
+  EXPECT_FALSE(isNumeric("3:25:45"));
+  EXPECT_FALSE(isNumeric("+12,345"));
+  EXPECT_FALSE(isNumeric("-inf"));
+  EXPECT_FALSE(isNumeric("1,230.15"));
+}
Index: llvm/tools/llvm-yaml-numeric-parser-fuzzer/yaml-numeric-parser-fuzzer.cpp
===
--- /dev/null
+++ llvm/tools/llvm-yaml-numeric-parser-fuzzer/yaml-numeric-parser-fuzzer.cpp
@@ -0,0 +1,45 @@
+//===--- special-case-list-fuzzer.cpp - Fuzzer for special case lists -===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+
+#include "llvm/ADT/StringRef.h"
+#include "llvm/Support/Regex.h"
+#include "llvm/Support/YAMLTraits.h"
+#include 
+#include 
+
+llvm::Regex Infinity("^[-+]?(\\.inf|\\.Inf|\\.INF)$");
+llvm::Regex Base8("^0o[0-7]+$");
+llvm::Regex Base16("^0x[0-9a-fA-F]+$");
+llvm::Regex Float("^[-+]?(\\.[0-9]+|[0-9]+(\\.[0-9]*)?)([eE][-+]?[0-9]+)?$");
+
+inline bool isNumericRegex(llvm::StringRef S) {
+  if (S.equals(".nan") || S.equals(".NaN") || S.equals(".NAN"))
+return true;
+
+  if (Infinity.match(S))
+return true;
+
+  if (Base8.match(S))
+return true;
+
+  if (Base16.match(S))
+return true;
+
+  if (Float.match(S))
+return true;
+
+  return false;
+}
+
+extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
+  std::string Input(reinterpret_cast<const char *>(Data), Size);
+  if (llvm::yaml::isNumeric(Input) != isNumericRegex(Input))
+    __builtin_trap();
+  return 0;
+}
Index: llvm/tools/llvm-yaml-n

[PATCH] D50703: [clangd] NFC: Mark Workspace Symbol feature complete in the documentation

2018-08-17 Thread Kirill Bobyrev via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL340007: [clangd] NFC: Mark Workspace Symbol feature complete 
in the documentation (authored by omtcyfz, committed by ).
Herald added a subscriber: llvm-commits.

Changed prior to commit:
  https://reviews.llvm.org/D50703?vs=160566&id=161195#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D50703

Files:
  clang-tools-extra/trunk/docs/clangd.rst


Index: clang-tools-extra/trunk/docs/clangd.rst
===
--- clang-tools-extra/trunk/docs/clangd.rst
+++ clang-tools-extra/trunk/docs/clangd.rst
@@ -64,7 +64,7 @@
 | Completion  | Yes|   Yes|
 +-++--+
 | Diagnostics | Yes|   Yes|
-+-++--+ 
++-++--+
 | Fix-its | Yes|   Yes|
 +-++--+
 | Go to Definition| Yes|   Yes|
@@ -83,7 +83,7 @@
 +-++--+
 | Document Symbols| Yes|   Yes|
 +-++--+
-| Workspace Symbols   | Yes|   No |
+| Workspace Symbols   | Yes|   Yes|
 +-++--+
 | Syntax and Semantic Coloring| No |   No |
 +-++--+




[PATCH] D50703: [clangd] NFC: Mark Workspace Symbol feature complete in the documentation

2018-08-17 Thread Kirill Bobyrev via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rCTE340007: [clangd] NFC: Mark Workspace Symbol feature 
complete in the documentation (authored by omtcyfz, committed by ).

Changed prior to commit:
  https://reviews.llvm.org/D50703?vs=160566&id=161196#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D50703

Files:
  docs/clangd.rst


Index: docs/clangd.rst
===
--- docs/clangd.rst
+++ docs/clangd.rst
@@ -64,7 +64,7 @@
 | Completion  | Yes|   Yes|
 +-++--+
 | Diagnostics | Yes|   Yes|
-+-++--+ 
++-++--+
 | Fix-its | Yes|   Yes|
 +-++--+
 | Go to Definition| Yes|   Yes|
@@ -83,7 +83,7 @@
 +-++--+
 | Document Symbols| Yes|   Yes|
 +-++--+
-| Workspace Symbols   | Yes|   No |
+| Workspace Symbols   | Yes|   Yes|
 +-++--+
 | Syntax and Semantic Coloring| No |   No |
 +-++--+




[PATCH] D50337: [clangd] DexIndex implementation prototype

2018-08-17 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev updated this revision to Diff 161202.
kbobyrev marked 6 inline comments as done.
kbobyrev added a comment.

Address a round of comments.


https://reviews.llvm.org/D50337

Files:
  clang-tools-extra/clangd/CMakeLists.txt
  clang-tools-extra/clangd/index/dex/DexIndex.cpp
  clang-tools-extra/clangd/index/dex/DexIndex.h
  clang-tools-extra/clangd/index/dex/Token.h
  clang-tools-extra/unittests/clangd/CMakeLists.txt
  clang-tools-extra/unittests/clangd/DexIndexTests.cpp
  clang-tools-extra/unittests/clangd/IndexTests.cpp
  clang-tools-extra/unittests/clangd/TestIndex.cpp
  clang-tools-extra/unittests/clangd/TestIndex.h

Index: clang-tools-extra/unittests/clangd/TestIndex.h
===
--- /dev/null
+++ clang-tools-extra/unittests/clangd/TestIndex.h
@@ -0,0 +1,57 @@
+//===-- IndexHelpers.h --*- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+
+#ifndef LLVM_CLANG_TOOLS_EXTRA_UNITTESTS_CLANGD_INDEXTESTCOMMON_H
+#define LLVM_CLANG_TOOLS_EXTRA_UNITTESTS_CLANGD_INDEXTESTCOMMON_H
+
+#include "index/Index.h"
+#include "index/Merge.h"
+#include "index/dex/DexIndex.h"
+#include "index/dex/Iterator.h"
+#include "index/dex/Token.h"
+#include "index/dex/Trigram.h"
+
+namespace clang {
+namespace clangd {
+
+Symbol symbol(llvm::StringRef QName);
+
+struct SlabAndPointers {
+  SymbolSlab Slab;
+  std::vector<const Symbol *> Pointers;
+};
+
+// Create a slab of symbols with the given qualified names as both IDs and
+// names. The life time of the slab is managed by the returned shared pointer.
+// If \p WeakSymbols is provided, it will be pointed to the managed object in
+// the returned shared pointer.
+std::shared_ptr<std::vector<const Symbol *>>
+generateSymbols(std::vector<std::string> QualifiedNames,
+                std::weak_ptr<SlabAndPointers> *WeakSymbols = nullptr);
+
+// Create a slab of symbols with IDs and names [Begin, End], otherwise identical
+// to the `generateSymbols` above.
+std::shared_ptr<std::vector<const Symbol *>>
+generateNumSymbols(int Begin, int End,
+                   std::weak_ptr<SlabAndPointers> *WeakSymbols = nullptr);
+
+std::string getQualifiedName(const Symbol &Sym);
+
+std::vector<std::string> match(const SymbolIndex &I,
+                               const FuzzyFindRequest &Req,
+                               bool *Incomplete = nullptr);
+
+// Returns qualified names of symbols with any of IDs in the index.
+std::vector<std::string> lookup(const SymbolIndex &I,
+                                llvm::ArrayRef<SymbolID> IDs);
+
+} // namespace clangd
+} // namespace clang
+
+#endif
Index: clang-tools-extra/unittests/clangd/TestIndex.cpp
===
--- /dev/null
+++ clang-tools-extra/unittests/clangd/TestIndex.cpp
@@ -0,0 +1,89 @@
+//===-- IndexHelpers.cpp *- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+
+#include "TestIndex.h"
+
+namespace clang {
+namespace clangd {
+
+Symbol symbol(llvm::StringRef QName) {
+  Symbol Sym;
+  Sym.ID = SymbolID(QName.str());
+  size_t Pos = QName.rfind("::");
+  if (Pos == llvm::StringRef::npos) {
+    Sym.Name = QName;
+    Sym.Scope = "";
+  } else {
+    Sym.Name = QName.substr(Pos + 2);
+    Sym.Scope = QName.substr(0, Pos + 2);
+  }
+  return Sym;
+}
+
+// Create a slab of symbols with the given qualified names as both IDs and
+// names. The life time of the slab is managed by the returned shared pointer.
+// If \p WeakSymbols is provided, it will be pointed to the managed object in
+// the returned shared pointer.
+std::shared_ptr<std::vector<const Symbol *>>
+generateSymbols(std::vector<std::string> QualifiedNames,
+                std::weak_ptr<SlabAndPointers> *WeakSymbols) {
+  SymbolSlab::Builder Slab;
+  for (llvm::StringRef QName : QualifiedNames)
+    Slab.insert(symbol(QName));
+
+  auto Storage = std::make_shared<SlabAndPointers>();
+  Storage->Slab = std::move(Slab).build();
+  for (const auto &Sym : Storage->Slab)
+    Storage->Pointers.push_back(&Sym);
+  if (WeakSymbols)
+    *WeakSymbols = Storage;
+  auto *Pointers = &Storage->Pointers;
+  return {std::move(Storage), Pointers};
+}
+
+// Create a slab of symbols with IDs and names [Begin, End], otherwise identical
+// to the `generateSymbols` above.
+std::shared_ptr<std::vector<const Symbol *>>
+generateNumSymbols(int Begin, int End,
+                   std::weak_ptr<SlabAndPointers> *WeakSymbols) {
+  std::vector<std::string> Names;
+  for (int i = Begin; i <= End; i++)
+    Names.push_back(std::to_string(i));
+  return generateSymbols(Names, WeakSymbols);
+}
+
+std::string getQualifiedName(const Symbol &Sym) {
+  return (Sym.Scope + Sym.Name).str();
+}
+
+std::vector<std::string> match(const SymbolIndex &I,
+  

[PATCH] D50839: [llvm] Optimize YAML::isNumeric

2018-08-17 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev updated this revision to Diff 161204.
kbobyrev added a comment.

Add a couple of tests, fix formatting issues, and use `__builtin_trap()` 
instead of `assert` in the fuzzer so that it's more transparent.

Also, fuzzing this unreadable version for a couple of hours suggests that it is 
valid.
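
The idea behind the fuzzer is differential testing: run a hand-written check 
and a slower reference on the same bytes and stop as soon as they disagree. A 
minimal sketch of that pattern follows, reduced to octal literals only. The 
names `isOctalFast`, `isOctalOracle`, `implementationsAgree` and 
`fuzzOneInputSketch` are hypothetical and are not part of the patch. One 
general reason to prefer an explicit trap is that `assert` is compiled out when 
`NDEBUG` is defined; `__builtin_trap()` is a GCC/Clang builtin.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <regex>
#include <string>

// Hand-written check (hypothetical): "0o" followed by one or more octal digits.
bool isOctalFast(const std::string &S) {
  if (S.size() < 3 || S[0] != '0' || S[1] != 'o')
    return false;
  return S.find_first_not_of("01234567", 2) == std::string::npos;
}

// Reference implementation (hypothetical): the same rule written as a regex.
bool isOctalOracle(const std::string &S) {
  static const std::regex Base8("0o[0-7]+");
  return std::regex_match(S, Base8);
}

// Returns true when both implementations give the same answer for the input.
bool implementationsAgree(const uint8_t *Data, size_t Size) {
  std::string Input(reinterpret_cast<const char *>(Data), Size);
  return isOctalFast(Input) == isOctalOracle(Input);
}

// Shaped like a libFuzzer entry point; traps on any disagreement, regardless
// of whether assertions are enabled in the build.
int fuzzOneInputSketch(const uint8_t *Data, size_t Size) {
  if (!implementationsAgree(Data, Size))
    __builtin_trap();
  return 0;
}
```

In a real fuzz target the body of `fuzzOneInputSketch` would live in 
`LLVMFuzzerTestOneInput`, as in the patch.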


https://reviews.llvm.org/D50839

Files:
  llvm/include/llvm/Support/YAMLTraits.h
  llvm/tools/llvm-yaml-numeric-parser-fuzzer/CMakeLists.txt
  llvm/tools/llvm-yaml-numeric-parser-fuzzer/DummyYAMLNumericParserFuzzer.cpp
  llvm/tools/llvm-yaml-numeric-parser-fuzzer/yaml-numeric-parser-fuzzer.cpp
  llvm/unittests/Support/YAMLIOTest.cpp

Index: llvm/unittests/Support/YAMLIOTest.cpp
===
--- llvm/unittests/Support/YAMLIOTest.cpp
+++ llvm/unittests/Support/YAMLIOTest.cpp
@@ -16,16 +16,17 @@
 #include "gmock/gmock.h"
 #include "gtest/gtest.h"
 
+using llvm::yaml::Hex16;
+using llvm::yaml::Hex32;
+using llvm::yaml::Hex64;
+using llvm::yaml::Hex8;
 using llvm::yaml::Input;
-using llvm::yaml::Output;
 using llvm::yaml::IO;
-using llvm::yaml::MappingTraits;
+using llvm::yaml::isNumeric;
 using llvm::yaml::MappingNormalization;
+using llvm::yaml::MappingTraits;
+using llvm::yaml::Output;
 using llvm::yaml::ScalarTraits;
-using llvm::yaml::Hex8;
-using llvm::yaml::Hex16;
-using llvm::yaml::Hex32;
-using llvm::yaml::Hex64;
 using ::testing::StartsWith;
 
 
@@ -2569,3 +2570,73 @@
 TestEscaped((char const *)foobar, "\"foo\\u200Bbar\"");
   }
 }
+
+TEST(YAMLIO, Numeric) {
+  EXPECT_TRUE(isNumeric(".inf"));
+  EXPECT_TRUE(isNumeric(".INF"));
+  EXPECT_TRUE(isNumeric(".Inf"));
+  EXPECT_TRUE(isNumeric("-.inf"));
+  EXPECT_TRUE(isNumeric("+.inf"));
+
+  EXPECT_TRUE(isNumeric(".nan"));
+  EXPECT_TRUE(isNumeric(".NaN"));
+  EXPECT_TRUE(isNumeric(".NAN"));
+
+  EXPECT_TRUE(isNumeric("0"));
+  EXPECT_TRUE(isNumeric("0."));
+  EXPECT_TRUE(isNumeric("0.0"));
+  EXPECT_TRUE(isNumeric("-0.0"));
+  EXPECT_TRUE(isNumeric("+0.0"));
+
+  EXPECT_TRUE(isNumeric("12345"));
+  EXPECT_TRUE(isNumeric("012345"));
+  EXPECT_TRUE(isNumeric("+12.0"));
+  EXPECT_TRUE(isNumeric(".5"));
+  EXPECT_TRUE(isNumeric("+.5"));
+  EXPECT_TRUE(isNumeric("-1.0"));
+
+  EXPECT_TRUE(isNumeric("2.3e4"));
+  EXPECT_TRUE(isNumeric("-2E+05"));
+  EXPECT_TRUE(isNumeric("+12e03"));
+  EXPECT_TRUE(isNumeric("6.8523015e+5"));
+
+  EXPECT_TRUE(isNumeric("1.e+1"));
+  EXPECT_TRUE(isNumeric(".0e+1"));
+
+  EXPECT_TRUE(isNumeric("0x2aF3"));
+  EXPECT_TRUE(isNumeric("0o01234567"));
+
+  EXPECT_FALSE(isNumeric("not a number"));
+  EXPECT_FALSE(isNumeric("."));
+  EXPECT_FALSE(isNumeric(".e+1"));
+  EXPECT_FALSE(isNumeric(".1e"));
+  EXPECT_FALSE(isNumeric(".1e+"));
+  EXPECT_FALSE(isNumeric(".1e++1"));
+
+  EXPECT_FALSE(isNumeric("ABCD"));
+  EXPECT_FALSE(isNumeric("+0x2AF3"));
+  EXPECT_FALSE(isNumeric("-0x2AF3"));
+  EXPECT_FALSE(isNumeric("0x2AF3Z"));
+  EXPECT_FALSE(isNumeric("0o012345678"));
+  EXPECT_FALSE(isNumeric("0xZ"));
+  EXPECT_FALSE(isNumeric("-0o012345678"));
+  EXPECT_FALSE(isNumeric("03A8229434B839616A25C16B0291F77A438B"));
+
+  EXPECT_FALSE(isNumeric(""));
+  EXPECT_FALSE(isNumeric("."));
+  EXPECT_FALSE(isNumeric(".e+1"));
+  EXPECT_FALSE(isNumeric(".e+"));
+  EXPECT_FALSE(isNumeric(".e"));
+  EXPECT_FALSE(isNumeric("e1"));
+
+  // Deprecated formats: as of the YAML 1.2 specification, the following are
+  // no longer valid numbers:
+  //
+  // * Sexagesimal numbers
+  // * Decimal numbers with a comma as the delimiter
+  // * "inf", "nan" without '.' prefix
+  EXPECT_FALSE(isNumeric("3:25:45"));
+  EXPECT_FALSE(isNumeric("+12,345"));
+  EXPECT_FALSE(isNumeric("-inf"));
+  EXPECT_FALSE(isNumeric("1,230.15"));
+}
Index: llvm/tools/llvm-yaml-numeric-parser-fuzzer/yaml-numeric-parser-fuzzer.cpp
===
--- /dev/null
+++ llvm/tools/llvm-yaml-numeric-parser-fuzzer/yaml-numeric-parser-fuzzer.cpp
@@ -0,0 +1,47 @@
+//===--- special-case-list-fuzzer.cpp - Fuzzer for special case lists -===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+
+#include "llvm/ADT/StringRef.h"
+#include "llvm/Support/Regex.h"
+#include "llvm/Support/YAMLTraits.h"
+#include 
+#include 
+
+llvm::Regex Infinity("^[-+]?(\\.inf|\\.Inf|\\.INF)$");
+llvm::Regex Base8("^0o[0-7]+$");
+llvm::Regex Base16("^0x[0-9a-fA-F]+$");
+llvm::Regex Float("^[-+]?(\\.[0-9]+|[0-9]+(\\.[0-9]*)?)([eE][-+]?[0-9]+)?$");
+
+inline bool isNumericRegex(llvm::StringRef S) {
+
+  if (S.equals(".nan") || S.equals(".NaN") || S.equals(".NAN"))
+return true;
+
+  if (Infinity.match(S))
+return true;
+
+  if (Base8.match(S))
+return true;
+
+  if (Base16.match(S))
+return true;
+
+  if (Float.match(S))
+return true;
+
+  return false;
+}
+
+extern "C" int LLVMFuzzerTestOneInp

[PATCH] D50337: [clangd] DexIndex implementation prototype

2018-08-17 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev updated this revision to Diff 161235.
kbobyrev marked 7 inline comments as done.
kbobyrev added a comment.

Address another round of comments.


https://reviews.llvm.org/D50337

Files:
  clang-tools-extra/clangd/CMakeLists.txt
  clang-tools-extra/clangd/index/dex/DexIndex.cpp
  clang-tools-extra/clangd/index/dex/DexIndex.h
  clang-tools-extra/clangd/index/dex/Token.h
  clang-tools-extra/unittests/clangd/CMakeLists.txt
  clang-tools-extra/unittests/clangd/DexIndexTests.cpp
  clang-tools-extra/unittests/clangd/IndexTests.cpp
  clang-tools-extra/unittests/clangd/TestIndex.cpp
  clang-tools-extra/unittests/clangd/TestIndex.h

Index: clang-tools-extra/unittests/clangd/TestIndex.h
===
--- /dev/null
+++ clang-tools-extra/unittests/clangd/TestIndex.h
@@ -0,0 +1,57 @@
+//===-- IndexHelpers.h --*- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+
+#ifndef LLVM_CLANG_TOOLS_EXTRA_UNITTESTS_CLANGD_INDEXTESTCOMMON_H
+#define LLVM_CLANG_TOOLS_EXTRA_UNITTESTS_CLANGD_INDEXTESTCOMMON_H
+
+#include "index/Index.h"
+#include "index/Merge.h"
+#include "index/dex/DexIndex.h"
+#include "index/dex/Iterator.h"
+#include "index/dex/Token.h"
+#include "index/dex/Trigram.h"
+
+namespace clang {
+namespace clangd {
+
+Symbol symbol(llvm::StringRef QName);
+
+struct SlabAndPointers {
+  SymbolSlab Slab;
+  std::vector Pointers;
+};
+
+// Create a slab of symbols with the given qualified names as both IDs and
+// names. The life time of the slab is managed by the returned shared pointer.
+// If \p WeakSymbols is provided, it will be pointed to the managed object in
+// the returned shared pointer.
+std::shared_ptr>
+generateSymbols(std::vector QualifiedNames,
+std::weak_ptr *WeakSymbols = nullptr);
+
+// Create a slab of symbols with IDs and names [Begin, End], otherwise identical
+// to the `generateSymbols` above.
+std::shared_ptr>
+generateNumSymbols(int Begin, int End,
+   std::weak_ptr *WeakSymbols = nullptr);
+
+std::string getQualifiedName(const Symbol &Sym);
+
+std::vector match(const SymbolIndex &I,
+   const FuzzyFindRequest &Req,
+   bool *Incomplete = nullptr);
+
+// Returns qualified names of symbols with any of IDs in the index.
+std::vector lookup(const SymbolIndex &I,
+llvm::ArrayRef IDs);
+
+} // namespace clangd
+} // namespace clang
+
+#endif
Index: clang-tools-extra/unittests/clangd/TestIndex.cpp
===
--- /dev/null
+++ clang-tools-extra/unittests/clangd/TestIndex.cpp
@@ -0,0 +1,89 @@
+//===-- IndexHelpers.cpp *- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+
+#include "TestIndex.h"
+
+namespace clang {
+namespace clangd {
+
+Symbol symbol(llvm::StringRef QName) {
+  Symbol Sym;
+  Sym.ID = SymbolID(QName.str());
+  size_t Pos = QName.rfind("::");
+  if (Pos == llvm::StringRef::npos) {
+Sym.Name = QName;
+Sym.Scope = "";
+  } else {
+Sym.Name = QName.substr(Pos + 2);
+Sym.Scope = QName.substr(0, Pos + 2);
+  }
+  return Sym;
+}
+
+// Create a slab of symbols with the given qualified names as both IDs and
+// names. The life time of the slab is managed by the returned shared pointer.
+// If \p WeakSymbols is provided, it will be pointed to the managed object in
+// the returned shared pointer.
+std::shared_ptr>
+generateSymbols(std::vector QualifiedNames,
+std::weak_ptr *WeakSymbols) {
+  SymbolSlab::Builder Slab;
+  for (llvm::StringRef QName : QualifiedNames)
+Slab.insert(symbol(QName));
+
+  auto Storage = std::make_shared();
+  Storage->Slab = std::move(Slab).build();
+  for (const auto &Sym : Storage->Slab)
+Storage->Pointers.push_back(&Sym);
+  if (WeakSymbols)
+*WeakSymbols = Storage;
+  auto *Pointers = &Storage->Pointers;
+  return {std::move(Storage), Pointers};
+}
+
+// Create a slab of symbols with IDs and names [Begin, End], otherwise identical
+// to the `generateSymbols` above.
+std::shared_ptr>
+generateNumSymbols(int Begin, int End,
+   std::weak_ptr *WeakSymbols) {
+  std::vector Names;
+  for (int i = Begin; i <= End; i++)
+Names.push_back(std::to_string(i));
+  return generateSymbols(Names, WeakSymbols);
+}
+
+std::string getQualifiedName(const Symbol &Sym) {
+  return (Sym.Scope + Sym.Name).str();
+}
+
+std::vector match(const SymbolIndex &I,
+

[PATCH] D50897: [clangd] Allow using experimental Dex index

2018-08-17 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev created this revision.
kbobyrev added reviewers: ioeric, ilya-biryukov.
kbobyrev added a project: clang-tools-extra.
Herald added subscribers: arphaman, jkorous, MaskRay.

This patch adds a hidden Clangd flag which replaces the (currently) default 
`MemIndex` with `DexIndex` for the static index.
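
As a rough illustration of the idea, selecting the index implementation from a boolean flag can look like the sketch below. All names (`IndexSketch`, `MemIndexSketch`, `DexIndexSketch`, `buildIndex`, `buildStaticIndexSketch`) are hypothetical stand-ins and not clangd's actual classes or signatures; the symbols are plain strings to keep the example self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical common interface, standing in for an index base class.
struct IndexSketch {
  virtual ~IndexSketch() = default;
  virtual std::string name() const = 0;
  virtual std::size_t size() const = 0;
};

class MemIndexSketch : public IndexSketch {
public:
  explicit MemIndexSketch(std::vector<std::string> Symbols)
      : Symbols(std::move(Symbols)) {}
  std::string name() const override { return "Mem"; }
  std::size_t size() const override { return Symbols.size(); }

private:
  std::vector<std::string> Symbols;
};

class DexIndexSketch : public IndexSketch {
public:
  explicit DexIndexSketch(std::vector<std::string> Symbols)
      : Symbols(std::move(Symbols)) {}
  std::string name() const override { return "Dex"; }
  std::size_t size() const override { return Symbols.size(); }

private:
  std::vector<std::string> Symbols;
};

// Generic builder: the concrete index type is a template parameter.
template <typename T>
std::unique_ptr<IndexSketch> buildIndex(std::vector<std::string> Symbols) {
  return std::make_unique<T>(std::move(Symbols));
}

// The flag only decides which concrete type is instantiated; callers see the
// common interface either way.
std::unique_ptr<IndexSketch>
buildStaticIndexSketch(std::vector<std::string> Symbols, bool UseDex) {
  const std::size_t Count = Symbols.size();
  std::unique_ptr<IndexSketch> Index =
      UseDex ? buildIndex<DexIndexSketch>(std::move(Symbols))
             : buildIndex<MemIndexSketch>(std::move(Symbols));
  assert(Index->size() == Count && "no symbols should be lost");
  (void)Count; // Silence unused-variable warnings when asserts are disabled.
  return Index;
}
```

Only one branch of the conditional is evaluated, so moving `Symbols` in both arms is safe.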


https://reviews.llvm.org/D50897

Files:
  clang-tools-extra/clangd/index/Index.h
  clang-tools-extra/clangd/index/MemIndex.cpp
  clang-tools-extra/clangd/index/MemIndex.h
  clang-tools-extra/clangd/tool/ClangdMain.cpp
  clang-tools-extra/unittests/clangd/CodeCompleteTests.cpp
  clang-tools-extra/unittests/clangd/TestTU.cpp

Index: clang-tools-extra/unittests/clangd/TestTU.cpp
===
--- clang-tools-extra/unittests/clangd/TestTU.cpp
+++ clang-tools-extra/unittests/clangd/TestTU.cpp
@@ -49,7 +49,7 @@
 }
 
 std::unique_ptr TestTU::index() const {
-  return MemIndex::build(headerSymbols());
+  return SymbolIndex::build(headerSymbols());
 }
 
 const Symbol &findSymbol(const SymbolSlab &Slab, llvm::StringRef QName) {
Index: clang-tools-extra/unittests/clangd/CodeCompleteTests.cpp
===
--- clang-tools-extra/unittests/clangd/CodeCompleteTests.cpp
+++ clang-tools-extra/unittests/clangd/CodeCompleteTests.cpp
@@ -76,7 +76,7 @@
   SymbolSlab::Builder Slab;
   for (const auto &Sym : Symbols)
 Slab.insert(Sym);
-  return MemIndex::build(std::move(Slab).build());
+  return SymbolIndex::build(std::move(Slab).build());
 }
 
 CodeCompleteResult completions(ClangdServer &Server, StringRef TestCode,
Index: clang-tools-extra/clangd/tool/ClangdMain.cpp
===
--- clang-tools-extra/clangd/tool/ClangdMain.cpp
+++ clang-tools-extra/clangd/tool/ClangdMain.cpp
@@ -7,48 +7,31 @@
 //
 //===--===//
 
+#include 
+#include 
+#include 
+#include 
+#include 
 #include "ClangdLSPServer.h"
 #include "JSONRPCDispatcher.h"
 #include "Path.h"
 #include "Trace.h"
-#include "index/SymbolYAML.h"
 #include "clang/Basic/Version.h"
+#include "index/SymbolYAML.h"
+#include "index/dex/DexIndex.h"
 #include "llvm/Support/CommandLine.h"
 #include "llvm/Support/FileSystem.h"
 #include "llvm/Support/Path.h"
 #include "llvm/Support/Program.h"
 #include "llvm/Support/Signals.h"
 #include "llvm/Support/raw_ostream.h"
-#include 
-#include 
-#include 
-#include 
-#include 
 
 using namespace clang;
 using namespace clang::clangd;
 
 namespace {
 enum class PCHStorageFlag { Disk, Memory };
 
-// Build an in-memory static index for global symbols from a YAML-format file.
-// The size of global symbols should be relatively small, so that all symbols
-// can be managed in memory.
-std::unique_ptr buildStaticIndex(llvm::StringRef YamlSymbolFile) {
-  auto Buffer = llvm::MemoryBuffer::getFile(YamlSymbolFile);
-  if (!Buffer) {
-llvm::errs() << "Can't open " << YamlSymbolFile << "\n";
-return nullptr;
-  }
-  auto Slab = symbolsFromYAML(Buffer.get()->getBuffer());
-  SymbolSlab::Builder SymsBuilder;
-  for (auto Sym : Slab)
-SymsBuilder.insert(Sym);
-
-  return MemIndex::build(std::move(SymsBuilder).build());
-}
-} // namespace
-
 static llvm::cl::opt CompileCommandsDir(
 "compile-commands-dir",
 llvm::cl::desc("Specify a path to look for compile_commands.json. If path "
@@ -185,6 +168,29 @@
 "'compile_commands.json' files")),
 llvm::cl::init(FilesystemCompileArgs), llvm::cl::Hidden);
 
+static llvm::cl::opt UseDex(
+"use-dex-index", llvm::cl::desc("Use experimental Dex static index."),
+llvm::cl::init(false), llvm::cl::Hidden);
+
+// Build an in-memory static index for global symbols from a YAML-format file.
+// The size of global symbols should be relatively small, so that all symbols
+// can be managed in memory.
+std::unique_ptr buildStaticIndex(llvm::StringRef YamlSymbolFile) {
+  auto Buffer = llvm::MemoryBuffer::getFile(YamlSymbolFile);
+  if (!Buffer) {
+llvm::errs() << "Can't open " << YamlSymbolFile << "\n";
+return nullptr;
+  }
+  auto Slab = symbolsFromYAML(Buffer.get()->getBuffer());
+  SymbolSlab::Builder SymsBuilder;
+  for (auto Sym : Slab) SymsBuilder.insert(Sym);
+
+  return UseDex
+ ? SymbolIndex::build(std::move(SymsBuilder).build())
+ : SymbolIndex::build(std::move(SymsBuilder).build());
+}
+}  // namespace
+
 int main(int argc, char *argv[]) {
   llvm::sys::PrintStackTraceOnErrorSignal(argv[0]);
   llvm::cl::SetVersionPrinter([](llvm::raw_ostream &OS) {
Index: clang-tools-extra/clangd/index/MemIndex.h
===
--- clang-tools-extra/clangd/index/MemIndex.h
+++ clang-tools-extra/clangd/index/MemIndex.h
@@ -24,9 +24,6 @@
   /// accessible as long as `Symbols` is kept alive.
   void build(std::shared_ptr> Symbols);
 
-  /// \brief Build index from 

[PATCH] D50897: [clangd] Allow using experimental Dex index

2018-08-17 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev updated this revision to Diff 161239.
kbobyrev added a comment.

Fix anonymous namespace beginning placement in Clangd driver.


https://reviews.llvm.org/D50897

Files:
  clang-tools-extra/clangd/index/Index.h
  clang-tools-extra/clangd/index/MemIndex.cpp
  clang-tools-extra/clangd/index/MemIndex.h
  clang-tools-extra/clangd/tool/ClangdMain.cpp
  clang-tools-extra/unittests/clangd/CodeCompleteTests.cpp
  clang-tools-extra/unittests/clangd/TestTU.cpp

Index: clang-tools-extra/unittests/clangd/TestTU.cpp
===
--- clang-tools-extra/unittests/clangd/TestTU.cpp
+++ clang-tools-extra/unittests/clangd/TestTU.cpp
@@ -49,7 +49,7 @@
 }
 
 std::unique_ptr TestTU::index() const {
-  return MemIndex::build(headerSymbols());
+  return SymbolIndex::build(headerSymbols());
 }
 
 const Symbol &findSymbol(const SymbolSlab &Slab, llvm::StringRef QName) {
Index: clang-tools-extra/unittests/clangd/CodeCompleteTests.cpp
===
--- clang-tools-extra/unittests/clangd/CodeCompleteTests.cpp
+++ clang-tools-extra/unittests/clangd/CodeCompleteTests.cpp
@@ -76,7 +76,7 @@
   SymbolSlab::Builder Slab;
   for (const auto &Sym : Symbols)
 Slab.insert(Sym);
-  return MemIndex::build(std::move(Slab).build());
+  return SymbolIndex::build(std::move(Slab).build());
 }
 
 CodeCompleteResult completions(ClangdServer &Server, StringRef TestCode,
Index: clang-tools-extra/clangd/tool/ClangdMain.cpp
===
--- clang-tools-extra/clangd/tool/ClangdMain.cpp
+++ clang-tools-extra/clangd/tool/ClangdMain.cpp
@@ -7,48 +7,28 @@
 //
 //===--===//
 
+#include 
+#include 
+#include 
+#include 
+#include 
 #include "ClangdLSPServer.h"
 #include "JSONRPCDispatcher.h"
 #include "Path.h"
 #include "Trace.h"
-#include "index/SymbolYAML.h"
 #include "clang/Basic/Version.h"
+#include "index/SymbolYAML.h"
+#include "index/dex/DexIndex.h"
 #include "llvm/Support/CommandLine.h"
 #include "llvm/Support/FileSystem.h"
 #include "llvm/Support/Path.h"
 #include "llvm/Support/Program.h"
 #include "llvm/Support/Signals.h"
 #include "llvm/Support/raw_ostream.h"
-#include 
-#include 
-#include 
-#include 
-#include 
 
 using namespace clang;
 using namespace clang::clangd;
 
-namespace {
-enum class PCHStorageFlag { Disk, Memory };
-
-// Build an in-memory static index for global symbols from a YAML-format file.
-// The size of global symbols should be relatively small, so that all symbols
-// can be managed in memory.
-std::unique_ptr buildStaticIndex(llvm::StringRef YamlSymbolFile) {
-  auto Buffer = llvm::MemoryBuffer::getFile(YamlSymbolFile);
-  if (!Buffer) {
-llvm::errs() << "Can't open " << YamlSymbolFile << "\n";
-return nullptr;
-  }
-  auto Slab = symbolsFromYAML(Buffer.get()->getBuffer());
-  SymbolSlab::Builder SymsBuilder;
-  for (auto Sym : Slab)
-SymsBuilder.insert(Sym);
-
-  return MemIndex::build(std::move(SymsBuilder).build());
-}
-} // namespace
-
 static llvm::cl::opt CompileCommandsDir(
 "compile-commands-dir",
 llvm::cl::desc("Specify a path to look for compile_commands.json. If path "
@@ -113,6 +93,31 @@
 "Intended to simplify lit tests."),
 llvm::cl::init(false), llvm::cl::Hidden);
 
+
+namespace {
+
+enum class PCHStorageFlag { Disk, Memory };
+
+// Build an in-memory static index for global symbols from a YAML-format file.
+// The size of global symbols should be relatively small, so that all symbols
+// can be managed in memory.
+std::unique_ptr buildStaticIndex(llvm::StringRef YamlSymbolFile) {
+  auto Buffer = llvm::MemoryBuffer::getFile(YamlSymbolFile);
+  if (!Buffer) {
+llvm::errs() << "Can't open " << YamlSymbolFile << "\n";
+return nullptr;
+  }
+  auto Slab = symbolsFromYAML(Buffer.get()->getBuffer());
+  SymbolSlab::Builder SymsBuilder;
+  for (auto Sym : Slab) SymsBuilder.insert(Sym);
+
+  return UseDex
+ ? SymbolIndex::build(std::move(SymsBuilder).build())
+ : SymbolIndex::build(std::move(SymsBuilder).build());
+}
+
+}  // namespace
+
 static llvm::cl::opt PCHStorage(
 "pch-storage",
 llvm::cl::desc("Storing PCHs in memory increases memory usages, but may "
@@ -185,6 +190,10 @@
 "'compile_commands.json' files")),
 llvm::cl::init(FilesystemCompileArgs), llvm::cl::Hidden);
 
+static llvm::cl::opt UseDex(
+"use-dex-index", llvm::cl::desc("Use experimental Dex static index."),
+llvm::cl::init(false), llvm::cl::Hidden);
+
 int main(int argc, char *argv[]) {
   llvm::sys::PrintStackTraceOnErrorSignal(argv[0]);
   llvm::cl::SetVersionPrinter([](llvm::raw_ostream &OS) {
Index: clang-tools-extra/clangd/index/MemIndex.h
===
--- clang-tools-extra/clangd/index/MemIndex.h
+++ clang-tools-extra/clangd/index/MemIndex.

[PATCH] D50337: [clangd] DexIndex implementation prototype

2018-08-17 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev updated this revision to Diff 161252.
kbobyrev marked 9 inline comments as done.
kbobyrev added a comment.

Address another round of comments.


https://reviews.llvm.org/D50337

Files:
  clang-tools-extra/clangd/CMakeLists.txt
  clang-tools-extra/clangd/index/dex/DexIndex.cpp
  clang-tools-extra/clangd/index/dex/DexIndex.h
  clang-tools-extra/clangd/index/dex/Token.h
  clang-tools-extra/unittests/clangd/CMakeLists.txt
  clang-tools-extra/unittests/clangd/DexIndexTests.cpp
  clang-tools-extra/unittests/clangd/IndexTests.cpp
  clang-tools-extra/unittests/clangd/TestIndex.cpp
  clang-tools-extra/unittests/clangd/TestIndex.h

Index: clang-tools-extra/unittests/clangd/TestIndex.h
===
--- /dev/null
+++ clang-tools-extra/unittests/clangd/TestIndex.h
@@ -0,0 +1,57 @@
+//===-- IndexHelpers.h --*- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+
+#ifndef LLVM_CLANG_TOOLS_EXTRA_UNITTESTS_CLANGD_INDEXTESTCOMMON_H
+#define LLVM_CLANG_TOOLS_EXTRA_UNITTESTS_CLANGD_INDEXTESTCOMMON_H
+
+#include "index/Index.h"
+#include "index/Merge.h"
+#include "index/dex/DexIndex.h"
+#include "index/dex/Iterator.h"
+#include "index/dex/Token.h"
+#include "index/dex/Trigram.h"
+
+namespace clang {
+namespace clangd {
+
+Symbol symbol(llvm::StringRef QName);
+
+struct SlabAndPointers {
+  SymbolSlab Slab;
+  std::vector Pointers;
+};
+
+// Create a slab of symbols with the given qualified names as both IDs and
+// names. The life time of the slab is managed by the returned shared pointer.
+// If \p WeakSymbols is provided, it will be pointed to the managed object in
+// the returned shared pointer.
+std::shared_ptr>
+generateSymbols(std::vector QualifiedNames,
+std::weak_ptr *WeakSymbols = nullptr);
+
+// Create a slab of symbols with IDs and names [Begin, End], otherwise identical
+// to the `generateSymbols` above.
+std::shared_ptr>
+generateNumSymbols(int Begin, int End,
+   std::weak_ptr *WeakSymbols = nullptr);
+
+std::string getQualifiedName(const Symbol &Sym);
+
+std::vector match(const SymbolIndex &I,
+   const FuzzyFindRequest &Req,
+   bool *Incomplete = nullptr);
+
+// Returns qualified names of symbols with any of IDs in the index.
+std::vector lookup(const SymbolIndex &I,
+llvm::ArrayRef IDs);
+
+} // namespace clangd
+} // namespace clang
+
+#endif
Index: clang-tools-extra/unittests/clangd/TestIndex.cpp
===
--- /dev/null
+++ clang-tools-extra/unittests/clangd/TestIndex.cpp
@@ -0,0 +1,89 @@
+//===-- IndexHelpers.cpp *- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+
+#include "TestIndex.h"
+
+namespace clang {
+namespace clangd {
+
+Symbol symbol(llvm::StringRef QName) {
+  Symbol Sym;
+  Sym.ID = SymbolID(QName.str());
+  size_t Pos = QName.rfind("::");
+  if (Pos == llvm::StringRef::npos) {
+Sym.Name = QName;
+Sym.Scope = "";
+  } else {
+Sym.Name = QName.substr(Pos + 2);
+Sym.Scope = QName.substr(0, Pos + 2);
+  }
+  return Sym;
+}
+
+// Create a slab of symbols with the given qualified names as both IDs and
+// names. The life time of the slab is managed by the returned shared pointer.
+// If \p WeakSymbols is provided, it will be pointed to the managed object in
+// the returned shared pointer.
+std::shared_ptr>
+generateSymbols(std::vector QualifiedNames,
+std::weak_ptr *WeakSymbols) {
+  SymbolSlab::Builder Slab;
+  for (llvm::StringRef QName : QualifiedNames)
+Slab.insert(symbol(QName));
+
+  auto Storage = std::make_shared();
+  Storage->Slab = std::move(Slab).build();
+  for (const auto &Sym : Storage->Slab)
+Storage->Pointers.push_back(&Sym);
+  if (WeakSymbols)
+*WeakSymbols = Storage;
+  auto *Pointers = &Storage->Pointers;
+  return {std::move(Storage), Pointers};
+}
+
+// Create a slab of symbols with IDs and names [Begin, End], otherwise identical
+// to the `generateSymbols` above.
+std::shared_ptr>
+generateNumSymbols(int Begin, int End,
+   std::weak_ptr *WeakSymbols) {
+  std::vector Names;
+  for (int i = Begin; i <= End; i++)
+Names.push_back(std::to_string(i));
+  return generateSymbols(Names, WeakSymbols);
+}
+
+std::string getQualifiedName(const Symbol &Sym) {
+  return (Sym.Scope + Sym.Name).str();
+}
+
+std::vector match(const SymbolIndex &I,
+

[PATCH] D50337: [clangd] DexIndex implementation prototype

2018-08-17 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev updated this revision to Diff 161273.
kbobyrev marked 3 inline comments as done.
kbobyrev added a comment.

Address all the comments, except the one about True iterators.


https://reviews.llvm.org/D50337

Files:
  clang-tools-extra/clangd/CMakeLists.txt
  clang-tools-extra/clangd/index/dex/DexIndex.cpp
  clang-tools-extra/clangd/index/dex/DexIndex.h
  clang-tools-extra/clangd/index/dex/Token.h
  clang-tools-extra/unittests/clangd/CMakeLists.txt
  clang-tools-extra/unittests/clangd/DexIndexTests.cpp
  clang-tools-extra/unittests/clangd/IndexTests.cpp
  clang-tools-extra/unittests/clangd/TestIndex.cpp
  clang-tools-extra/unittests/clangd/TestIndex.h

Index: clang-tools-extra/unittests/clangd/TestIndex.h
===
--- /dev/null
+++ clang-tools-extra/unittests/clangd/TestIndex.h
@@ -0,0 +1,57 @@
+//===-- IndexHelpers.h --*- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+
+#ifndef LLVM_CLANG_TOOLS_EXTRA_UNITTESTS_CLANGD_INDEXTESTCOMMON_H
+#define LLVM_CLANG_TOOLS_EXTRA_UNITTESTS_CLANGD_INDEXTESTCOMMON_H
+
+#include "index/Index.h"
+#include "index/Merge.h"
+#include "index/dex/DexIndex.h"
+#include "index/dex/Iterator.h"
+#include "index/dex/Token.h"
+#include "index/dex/Trigram.h"
+
+namespace clang {
+namespace clangd {
+
+Symbol symbol(llvm::StringRef QName);
+
+struct SlabAndPointers {
+  SymbolSlab Slab;
+  std::vector Pointers;
+};
+
+// Create a slab of symbols with the given qualified names as both IDs and
+// names. The life time of the slab is managed by the returned shared pointer.
+// If \p WeakSymbols is provided, it will be pointed to the managed object in
+// the returned shared pointer.
+std::shared_ptr>
+generateSymbols(std::vector QualifiedNames,
+std::weak_ptr *WeakSymbols = nullptr);
+
+// Create a slab of symbols with IDs and names [Begin, End], otherwise identical
+// to the `generateSymbols` above.
+std::shared_ptr>
+generateNumSymbols(int Begin, int End,
+   std::weak_ptr *WeakSymbols = nullptr);
+
+std::string getQualifiedName(const Symbol &Sym);
+
+std::vector match(const SymbolIndex &I,
+   const FuzzyFindRequest &Req,
+   bool *Incomplete = nullptr);
+
+// Returns qualified names of symbols with any of IDs in the index.
+std::vector lookup(const SymbolIndex &I,
+llvm::ArrayRef IDs);
+
+} // namespace clangd
+} // namespace clang
+
+#endif
Index: clang-tools-extra/unittests/clangd/TestIndex.cpp
===
--- /dev/null
+++ clang-tools-extra/unittests/clangd/TestIndex.cpp
@@ -0,0 +1,89 @@
+//===-- IndexHelpers.cpp *- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+
+#include "TestIndex.h"
+
+namespace clang {
+namespace clangd {
+
+Symbol symbol(llvm::StringRef QName) {
+  Symbol Sym;
+  Sym.ID = SymbolID(QName.str());
+  size_t Pos = QName.rfind("::");
+  if (Pos == llvm::StringRef::npos) {
+Sym.Name = QName;
+Sym.Scope = "";
+  } else {
+Sym.Name = QName.substr(Pos + 2);
+Sym.Scope = QName.substr(0, Pos + 2);
+  }
+  return Sym;
+}
+
+// Create a slab of symbols with the given qualified names as both IDs and
+// names. The life time of the slab is managed by the returned shared pointer.
+// If \p WeakSymbols is provided, it will be pointed to the managed object in
+// the returned shared pointer.
+std::shared_ptr>
+generateSymbols(std::vector QualifiedNames,
+std::weak_ptr *WeakSymbols) {
+  SymbolSlab::Builder Slab;
+  for (llvm::StringRef QName : QualifiedNames)
+Slab.insert(symbol(QName));
+
+  auto Storage = std::make_shared();
+  Storage->Slab = std::move(Slab).build();
+  for (const auto &Sym : Storage->Slab)
+Storage->Pointers.push_back(&Sym);
+  if (WeakSymbols)
+*WeakSymbols = Storage;
+  auto *Pointers = &Storage->Pointers;
+  return {std::move(Storage), Pointers};
+}
+
+// Create a slab of symbols with IDs and names [Begin, End], otherwise identical
+// to the `generateSymbols` above.
+std::shared_ptr>
+generateNumSymbols(int Begin, int End,
+   std::weak_ptr *WeakSymbols) {
+  std::vector Names;
+  for (int i = Begin; i <= End; i++)
+Names.push_back(std::to_string(i));
+  return generateSymbols(Names, WeakSymbols);
+}
+
+std::string getQualifiedName(const Symbol &Sym) {
+  return (Sym.Scope + Sym.Name).str();
+}
+
+std::vector match(

[PATCH] D50337: [clangd] DexIndex implementation prototype

2018-08-17 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev planned changes to this revision.
kbobyrev added a comment.

I should create another patch with True iterator to address the last comment.




Comment at: clang-tools-extra/clangd/index/dex/DexIndex.cpp:97
+// Add OR iterator for scopes if the request contains scopes.
+if (!Req.Scopes.empty()) {
+  TopLevelChildren.push_back(createScopeIterator(Req.Scopes));

ioeric wrote:
> I think we should let `createScopeIterator` handle empty scope list case; it 
> can return an empty list anyway.
Yes, but it returns an iterator now, and `OrIterator` (just like any other 
iterator) has to have a non-empty list of children.
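
To illustrate the constraint, here is a minimal sketch that materializes the union eagerly instead of using lazy iterators as Dex does. `unionOf` and `addScopeFilter` are hypothetical names; the point is only that the OR step asserts a non-empty child list, so the emptiness check stays at the call site.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <utility>
#include <vector>

using DocID = std::uint32_t;
using PostingList = std::vector<DocID>;

// Union of sorted posting lists. Mirrors the precondition that an OR node
// must have at least one child.
PostingList unionOf(const std::vector<PostingList> &Children) {
  assert(!Children.empty() && "OR needs at least one child");
  PostingList Result;
  for (const PostingList &Child : Children) {
    PostingList Merged;
    std::set_union(Result.begin(), Result.end(), Child.begin(), Child.end(),
                   std::back_inserter(Merged));
    Result = std::move(Merged);
  }
  return Result;
}

// Caller-side guard: only add a scope filter when scopes were requested.
void addScopeFilter(std::vector<PostingList> &TopLevelChildren,
                    const std::vector<PostingList> &ScopeLists) {
  if (!ScopeLists.empty())
    TopLevelChildren.push_back(unionOf(ScopeLists));
}
```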


https://reviews.llvm.org/D50337



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D50703: [clangd] NFC: Mark Workspace Symbol feature complete in the documentation

2018-08-18 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev added a comment.

In https://reviews.llvm.org/D50703#1205049, @malaperle wrote:

> I hadn't marked it as done because without symbols in main files I found it 
> quite lacking.


Ah, I see, thank you for spotting it! I was under the impression that it's 
feature-complete. Should I mark it "Partial" until this is fixed then?


Repository:
  rL LLVM

https://reviews.llvm.org/D50703





[PATCH] D50955: [clangd] Implement TRUE Iterator

2018-08-20 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev created this revision.
kbobyrev added a reviewer: ioeric.
Herald added subscribers: kadircet, arphaman, jkorous, MaskRay.

This patch introduces a TRUE iterator, which efficiently handles posting lists 
containing all items within the `[0, Size)` range.
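
A standalone sketch of the idea follows. It is not the patch's class: `TrueIteratorSketch` does not derive from the Dex `Iterator` interface, and `intersectWithTrue` is a hypothetical helper that plays the role of an AND with a regular posting list. It only shows how storing `Size` and a cursor makes every operation O(1).

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using DocID = std::uint32_t;

// Iterates over the virtual posting list [0, Size) without storing it.
class TrueIteratorSketch {
public:
  explicit TrueIteratorSketch(DocID Size) : Size(Size) {}

  bool reachedEnd() const { return Index >= Size; }

  void advance() {
    assert(!reachedEnd() && "can't advance past the end");
    ++Index;
  }

  // Every ID below Size is present, so jumping is a plain assignment.
  void advanceTo(DocID ID) {
    assert(!reachedEnd() && "can't advance past the end");
    Index = ID;
  }

  DocID peek() const {
    assert(!reachedEnd() && "can't peek at the end");
    return Index;
  }

private:
  DocID Index = 0;
  DocID Size;
};

// Intersect a sorted posting list with the virtual list [0, Size).
std::vector<DocID> intersectWithTrue(const std::vector<DocID> &Sorted,
                                     DocID Size) {
  std::vector<DocID> Result;
  TrueIteratorSketch True(Size);
  for (DocID ID : Sorted) {
    if (True.reachedEnd())
      break;
    True.advanceTo(ID);
    if (True.reachedEnd())
      break;
    Result.push_back(True.peek());
  }
  return Result;
}
```

With the posting list `{1, 2, 5, 7}` and `Size` equal to 7, the intersection keeps 1, 2 and 5, which matches the expectation in the unit test added by this patch.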


https://reviews.llvm.org/D50955

Files:
  clang-tools-extra/clangd/index/dex/Iterator.cpp
  clang-tools-extra/clangd/index/dex/Iterator.h
  clang-tools-extra/unittests/clangd/DexIndexTests.cpp


Index: clang-tools-extra/unittests/clangd/DexIndexTests.cpp
===
--- clang-tools-extra/unittests/clangd/DexIndexTests.cpp
+++ clang-tools-extra/unittests/clangd/DexIndexTests.cpp
@@ -262,6 +262,19 @@
   EXPECT_THAT(consume(*DocIterator, 0), ElementsAre());
 }
 
+TEST(DexIndexIterators, True) {
+  auto TrueIterator = createTrue(0U);
+  EXPECT_THAT(TrueIterator->reachedEnd(), true);
+  EXPECT_THAT(consume(*TrueIterator), ElementsAre());
+
+  PostingList L0 = {1, 2, 5, 7};
+  TrueIterator = createTrue(7U);
+  EXPECT_THAT(TrueIterator->peek(), 0);
+  auto AndIterator = createAnd(create(L0), move(TrueIterator));
+  EXPECT_THAT(AndIterator->reachedEnd(), false);
+  EXPECT_THAT(consume(*AndIterator), ElementsAre(1, 2, 5));
+}
+
 testing::Matcher>
 trigramsAre(std::initializer_list Trigrams) {
   std::vector Tokens;
Index: clang-tools-extra/clangd/index/dex/Iterator.h
===
--- clang-tools-extra/clangd/index/dex/Iterator.h
+++ clang-tools-extra/clangd/index/dex/Iterator.h
@@ -129,6 +129,10 @@
 std::unique_ptr
 createOr(std::vector> Children);
 
+/// Returns TRUE Iterator which iterates over "virtual" PostingList containing
+/// all items in range [0, Size) in an efficient manner.
+std::unique_ptr createTrue(DocID Size);
+
 /// This allows createAnd(create(...), create(...)) syntax.
 template  std::unique_ptr createAnd(Args... args) {
   std::vector> Children;
Index: clang-tools-extra/clangd/index/dex/Iterator.cpp
===
--- clang-tools-extra/clangd/index/dex/Iterator.cpp
+++ clang-tools-extra/clangd/index/dex/Iterator.cpp
@@ -225,6 +225,45 @@
   std::vector> Children;
 };
 
+/// TrueIterator handles PostingLists which contain all items of the index. In
+/// order to prevent additional memory consumption, it only stores the size of
+/// this virtual posting list because posting lists of such density are likely
+/// to consume a lot of memory. All operations are performed in O(1) as a
+/// result which also significantly improves performance of the iterator,
+/// because iterating such posting list via DocumentIterator would result in
+/// increased cost of advanceTo().
+class TrueIterator : public Iterator {
+public:
+  TrueIterator(DocID Size) : Size(Size) {}
+
+  bool reachedEnd() const override { return Index >= Size; }
+
+  void advance() override {
+assert(!reachedEnd() && "Can't advance iterator after it reached the end.");
+++Index;
+  }
+
+  void advanceTo(DocID ID) override {
+assert(!reachedEnd() && "Can't advance iterator after it reached the end.");
+Index = ID;
+  }
+
+  DocID peek() const override {
+assert(!reachedEnd() && "TrueIterator can't call peek() at the end.");
+return Index;
+  }
+
+private:
+  llvm::raw_ostream &dump(llvm::raw_ostream &OS) const override {
+OS << "(TrueIterator {" << Index << "} out of " << Size << ")";
+return OS;
+  }
+
+  DocID Index = 0;
+  /// Size of the underlying virtual PostingList.
+  DocID Size;
+};
+
 } // end namespace
 
 std::vector consume(Iterator &It, size_t Limit) {
@@ -249,6 +288,10 @@
   return llvm::make_unique(move(Children));
 }
 
+std::unique_ptr createTrue(DocID Size) {
+  return llvm::make_unique(Size);
+}
+
 } // namespace dex
 } // namespace clangd
 } // namespace clang



[PATCH] D50956: [clangd] NFC: Cleanup Dex Iterator comments and simplify tests

2018-08-20 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev created this revision.
kbobyrev added a reviewer: ioeric.
Herald added subscribers: kadircet, arphaman, omtcyfz, jkorous, MaskRay, 
ilya-biryukov.

Proposed changes:

- Clean up comments in `clangd/index/dex/Iterator.h`: Vim's `gq` formatting 
added redundant spaces instead of newlines in a few places
- Fix a few incorrect comments in `OrIterator`
- Use `EXPECT_TRUE(Condition)` instead of `EXPECT_THAT(Condition, true)` (same 
with `EXPECT_FALSE`)

This patch does not affect functionality.


https://reviews.llvm.org/D50956

Files:
  clang-tools-extra/clangd/index/dex/Iterator.cpp
  clang-tools-extra/clangd/index/dex/Iterator.h
  clang-tools-extra/unittests/clangd/DexIndexTests.cpp

Index: clang-tools-extra/unittests/clangd/DexIndexTests.cpp
===
--- clang-tools-extra/unittests/clangd/DexIndexTests.cpp
+++ clang-tools-extra/unittests/clangd/DexIndexTests.cpp
@@ -28,33 +28,33 @@
   auto DocIterator = create(L);
 
   EXPECT_EQ(DocIterator->peek(), 4U);
-  EXPECT_EQ(DocIterator->reachedEnd(), false);
+  EXPECT_FALSE(DocIterator->reachedEnd());
 
   DocIterator->advance();
   EXPECT_EQ(DocIterator->peek(), 7U);
-  EXPECT_EQ(DocIterator->reachedEnd(), false);
+  EXPECT_FALSE(DocIterator->reachedEnd());
 
   DocIterator->advanceTo(20);
   EXPECT_EQ(DocIterator->peek(), 20U);
-  EXPECT_EQ(DocIterator->reachedEnd(), false);
+  EXPECT_FALSE(DocIterator->reachedEnd());
 
   DocIterator->advanceTo(65);
   EXPECT_EQ(DocIterator->peek(), 100U);
-  EXPECT_EQ(DocIterator->reachedEnd(), false);
+  EXPECT_FALSE(DocIterator->reachedEnd());
 
   DocIterator->advanceTo(420);
-  EXPECT_EQ(DocIterator->reachedEnd(), true);
+  EXPECT_TRUE(DocIterator->reachedEnd());
 }
 
 TEST(DexIndexIterators, AndWithEmpty) {
   const PostingList L0;
   const PostingList L1 = {0, 5, 7, 10, 42, 320, 9000};
 
   auto AndEmpty = createAnd(create(L0));
-  EXPECT_EQ(AndEmpty->reachedEnd(), true);
+  EXPECT_TRUE(AndEmpty->reachedEnd());
 
   auto AndWithEmpty = createAnd(create(L0), create(L1));
-  EXPECT_EQ(AndWithEmpty->reachedEnd(), true);
+  EXPECT_TRUE(AndWithEmpty->reachedEnd());
 
   EXPECT_THAT(consume(*AndWithEmpty), ElementsAre());
 }
@@ -65,7 +65,7 @@
 
   auto And = createAnd(create(L1), create(L0));
 
-  EXPECT_EQ(And->reachedEnd(), false);
+  EXPECT_FALSE(And->reachedEnd());
   EXPECT_THAT(consume(*And), ElementsAre(0U, 7U, 10U, 320U, 9000U));
 
   And = createAnd(create(L0), create(L1));
@@ -94,18 +94,18 @@
   EXPECT_EQ(And->peek(), 320U);
   And->advanceTo(10);
 
-  EXPECT_EQ(And->reachedEnd(), true);
+  EXPECT_TRUE(And->reachedEnd());
 }
 
 TEST(DexIndexIterators, OrWithEmpty) {
   const PostingList L0;
   const PostingList L1 = {0, 5, 7, 10, 42, 320, 9000};
 
   auto OrEmpty = createOr(create(L0));
-  EXPECT_EQ(OrEmpty->reachedEnd(), true);
+  EXPECT_TRUE(OrEmpty->reachedEnd());
 
   auto OrWithEmpty = createOr(create(L0), create(L1));
-  EXPECT_EQ(OrWithEmpty->reachedEnd(), false);
+  EXPECT_FALSE(OrWithEmpty->reachedEnd());
 
   EXPECT_THAT(consume(*OrWithEmpty),
   ElementsAre(0U, 5U, 7U, 10U, 42U, 320U, 9000U));
@@ -117,7 +117,7 @@
 
   auto Or = createOr(create(L0), create(L1));
 
-  EXPECT_EQ(Or->reachedEnd(), false);
+  EXPECT_FALSE(Or->reachedEnd());
   EXPECT_EQ(Or->peek(), 0U);
   Or->advance();
   EXPECT_EQ(Or->peek(), 4U);
@@ -136,7 +136,7 @@
   Or->advanceTo(9000);
   EXPECT_EQ(Or->peek(), 9000U);
   Or->advanceTo(9001);
-  EXPECT_EQ(Or->reachedEnd(), true);
+  EXPECT_TRUE(Or->reachedEnd());
 
   Or = createOr(create(L0), create(L1));
 
@@ -151,7 +151,7 @@
 
   auto Or = createOr(create(L0), create(L1), create(L2));
 
-  EXPECT_EQ(Or->reachedEnd(), false);
+  EXPECT_FALSE(Or->reachedEnd());
   EXPECT_EQ(Or->peek(), 0U);
 
   Or->advance();
@@ -166,7 +166,7 @@
   EXPECT_EQ(Or->peek(), 60U);
 
   Or->advanceTo(9001);
-  EXPECT_EQ(Or->reachedEnd(), true);
+  EXPECT_TRUE(Or->reachedEnd());
 }
 
 // FIXME(kbobyrev): The testcase below is similar to what is expected in real
@@ -208,7 +208,7 @@
   // Lower Or Iterator: [0, 1, 5]
   createOr(create(L2), create(L3), create(L4)));
 
-  EXPECT_EQ(Root->reachedEnd(), false);
+  EXPECT_FALSE(Root->reachedEnd());
   EXPECT_EQ(Root->peek(), 1U);
   Root->advanceTo(0);
   // Advance multiple times. Shouldn't do anything.
@@ -220,7 +220,7 @@
   Root->advanceTo(5);
   EXPECT_EQ(Root->peek(), 5U);
   Root->advanceTo(9000);
-  EXPECT_EQ(Root->reachedEnd(), true);
+  EXPECT_TRUE(Root->reachedEnd());
 }
 
 TEST(DexIndexIterators, StringRepresentation) {
@@ -264,14 +264,14 @@
 
 TEST(DexIndexIterators, True) {
   auto TrueIterator = createTrue(0U);
-  EXPECT_THAT(TrueIterator->reachedEnd(), true);
+  EXPECT_TRUE(TrueIterator->reachedEnd());
   EXPECT_THAT(consume(*TrueIterator), ElementsAre());
 
   PostingList L0 = {1, 2, 5, 7};
   TrueIterator = createTrue(7U);
   EXPECT_THAT(TrueIterator->peek(), 0);
   auto AndIterator = createAnd(create(L0), move(TrueIterator));
-  EXPECT_THAT(AndIterator->reachedEnd(),

[PATCH] D50337: [clangd] DexIndex implementation prototype

2018-08-20 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev updated this revision to Diff 161432.
kbobyrev marked 2 inline comments as done.
kbobyrev added a comment.
Herald added a subscriber: kadircet.

Use TRUE iterator to ensure validity of the query processing.


https://reviews.llvm.org/D50337

Files:
  clang-tools-extra/clangd/CMakeLists.txt
  clang-tools-extra/clangd/index/dex/DexIndex.cpp
  clang-tools-extra/clangd/index/dex/DexIndex.h
  clang-tools-extra/clangd/index/dex/Token.h
  clang-tools-extra/unittests/clangd/CMakeLists.txt
  clang-tools-extra/unittests/clangd/DexIndexTests.cpp
  clang-tools-extra/unittests/clangd/IndexTests.cpp
  clang-tools-extra/unittests/clangd/TestIndex.cpp
  clang-tools-extra/unittests/clangd/TestIndex.h

Index: clang-tools-extra/unittests/clangd/TestIndex.h
===
--- /dev/null
+++ clang-tools-extra/unittests/clangd/TestIndex.h
@@ -0,0 +1,57 @@
+//===-- IndexHelpers.h --*- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+
+#ifndef LLVM_CLANG_TOOLS_EXTRA_UNITTESTS_CLANGD_INDEXTESTCOMMON_H
+#define LLVM_CLANG_TOOLS_EXTRA_UNITTESTS_CLANGD_INDEXTESTCOMMON_H
+
+#include "index/Index.h"
+#include "index/Merge.h"
+#include "index/dex/DexIndex.h"
+#include "index/dex/Iterator.h"
+#include "index/dex/Token.h"
+#include "index/dex/Trigram.h"
+
+namespace clang {
+namespace clangd {
+
+Symbol symbol(llvm::StringRef QName);
+
+struct SlabAndPointers {
+  SymbolSlab Slab;
+  std::vector Pointers;
+};
+
+// Create a slab of symbols with the given qualified names as both IDs and
+// names. The life time of the slab is managed by the returned shared pointer.
+// If \p WeakSymbols is provided, it will be pointed to the managed object in
+// the returned shared pointer.
+std::shared_ptr>
+generateSymbols(std::vector QualifiedNames,
+std::weak_ptr *WeakSymbols = nullptr);
+
+// Create a slab of symbols with IDs and names [Begin, End], otherwise identical
+// to the `generateSymbols` above.
+std::shared_ptr>
+generateNumSymbols(int Begin, int End,
+   std::weak_ptr *WeakSymbols = nullptr);
+
+std::string getQualifiedName(const Symbol &Sym);
+
+std::vector match(const SymbolIndex &I,
+   const FuzzyFindRequest &Req,
+   bool *Incomplete = nullptr);
+
+// Returns qualified names of symbols with any of IDs in the index.
+std::vector lookup(const SymbolIndex &I,
+llvm::ArrayRef IDs);
+
+} // namespace clangd
+} // namespace clang
+
+#endif
Index: clang-tools-extra/unittests/clangd/TestIndex.cpp
===
--- /dev/null
+++ clang-tools-extra/unittests/clangd/TestIndex.cpp
@@ -0,0 +1,89 @@
+//===-- IndexHelpers.cpp *- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+
+#include "TestIndex.h"
+
+namespace clang {
+namespace clangd {
+
+Symbol symbol(llvm::StringRef QName) {
+  Symbol Sym;
+  Sym.ID = SymbolID(QName.str());
+  size_t Pos = QName.rfind("::");
+  if (Pos == llvm::StringRef::npos) {
+Sym.Name = QName;
+Sym.Scope = "";
+  } else {
+Sym.Name = QName.substr(Pos + 2);
+Sym.Scope = QName.substr(0, Pos + 2);
+  }
+  return Sym;
+}
+
+// Create a slab of symbols with the given qualified names as both IDs and
+// names. The life time of the slab is managed by the returned shared pointer.
+// If \p WeakSymbols is provided, it will be pointed to the managed object in
+// the returned shared pointer.
+std::shared_ptr>
+generateSymbols(std::vector QualifiedNames,
+std::weak_ptr *WeakSymbols) {
+  SymbolSlab::Builder Slab;
+  for (llvm::StringRef QName : QualifiedNames)
+Slab.insert(symbol(QName));
+
+  auto Storage = std::make_shared();
+  Storage->Slab = std::move(Slab).build();
+  for (const auto &Sym : Storage->Slab)
+Storage->Pointers.push_back(&Sym);
+  if (WeakSymbols)
+*WeakSymbols = Storage;
+  auto *Pointers = &Storage->Pointers;
+  return {std::move(Storage), Pointers};
+}
+
+// Create a slab of symbols with IDs and names [Begin, End], otherwise identical
+// to the `generateSymbols` above.
+std::shared_ptr>
+generateNumSymbols(int Begin, int End,
+   std::weak_ptr *WeakSymbols) {
+  std::vector Names;
+  for (int i = Begin; i <= End; i++)
+Names.push_back(std::to_string(i));
+  return generateSymbols(Names, WeakSymbols);
+}
+
+std::string getQualifiedName(const Symbol &Sym) {
+  return (Sym.Scope + Sym.

[PATCH] D50955: [clangd] Implement TRUE Iterator

2018-08-20 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev updated this revision to Diff 161435.
kbobyrev marked 2 inline comments as done.
kbobyrev added a comment.

Address a couple of comments.


https://reviews.llvm.org/D50955

Files:
  clang-tools-extra/clangd/index/dex/Iterator.cpp
  clang-tools-extra/clangd/index/dex/Iterator.h
  clang-tools-extra/unittests/clangd/DexIndexTests.cpp


Index: clang-tools-extra/unittests/clangd/DexIndexTests.cpp
===
--- clang-tools-extra/unittests/clangd/DexIndexTests.cpp
+++ clang-tools-extra/unittests/clangd/DexIndexTests.cpp
@@ -262,6 +262,19 @@
   EXPECT_THAT(consume(*DocIterator, 0), ElementsAre());
 }
 
+TEST(DexIndexIterators, True) {
+  auto TrueIterator = createTrue(0U);
+  EXPECT_THAT(TrueIterator->reachedEnd(), true);
+  EXPECT_THAT(consume(*TrueIterator), ElementsAre());
+
+  PostingList L0 = {1, 2, 5, 7};
+  TrueIterator = createTrue(7U);
+  EXPECT_THAT(TrueIterator->peek(), 0);
+  auto AndIterator = createAnd(create(L0), move(TrueIterator));
+  EXPECT_THAT(AndIterator->reachedEnd(), false);
+  EXPECT_THAT(consume(*AndIterator), ElementsAre(1, 2, 5));
+}
+
 testing::Matcher>
 trigramsAre(std::initializer_list Trigrams) {
   std::vector Tokens;
Index: clang-tools-extra/clangd/index/dex/Iterator.h
===
--- clang-tools-extra/clangd/index/dex/Iterator.h
+++ clang-tools-extra/clangd/index/dex/Iterator.h
@@ -129,6 +129,10 @@
 std::unique_ptr
 createOr(std::vector> Children);
 
+/// Returns TRUE Iterator which iterates over "virtual" PostingList containing
+/// all items in range [0, Size) in an efficient manner.
+std::unique_ptr<Iterator> createTrue(DocID Size);
+
 /// This allows createAnd(create(...), create(...)) syntax.
 template  std::unique_ptr createAnd(Args... args) {
   std::vector> Children;
Index: clang-tools-extra/clangd/index/dex/Iterator.cpp
===
--- clang-tools-extra/clangd/index/dex/Iterator.cpp
+++ clang-tools-extra/clangd/index/dex/Iterator.cpp
@@ -225,6 +225,41 @@
   std::vector> Children;
 };
 
+/// TrueIterator handles PostingLists which contain all items of the index. It
+/// stores size of the virtual posting list, and all operations are performed
+/// in O(1).
+class TrueIterator : public Iterator {
+public:
+  TrueIterator(DocID Size) : Size(Size) {}
+
+  bool reachedEnd() const override { return Index >= Size; }
+
+  void advance() override {
+    assert(!reachedEnd() && "Can't advance iterator after it reached the end.");
+    ++Index;
+  }
+
+  void advanceTo(DocID ID) override {
+    assert(!reachedEnd() && "Can't advance iterator after it reached the end.");
+    Index = std::min(ID, Size);
+  }
+
+  DocID peek() const override {
+    assert(!reachedEnd() && "TrueIterator can't call peek() at the end.");
+    return Index;
+  }
+
+private:
+  llvm::raw_ostream &dump(llvm::raw_ostream &OS) const override {
+    OS << "(TRUE {" << Index << "} out of " << Size << ")";
+    return OS;
+  }
+
+  DocID Index = 0;
+  /// Size of the underlying virtual PostingList.
+  DocID Size;
+};
+
 } // end namespace
 
 std::vector consume(Iterator &It, size_t Limit) {
@@ -249,6 +284,10 @@
   return llvm::make_unique(move(Children));
 }
 
+std::unique_ptr<Iterator> createTrue(DocID Size) {
+  return llvm::make_unique<TrueIterator>(Size);
+}
+
 } // namespace dex
 } // namespace clangd
 } // namespace clang


[PATCH] D50955: [clangd] Implement TRUE Iterator

2018-08-20 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev added inline comments.



Comment at: clang-tools-extra/clangd/index/dex/Iterator.cpp:248
+    assert(!reachedEnd() && "Can't advance iterator after it reached the end.");
+    Index = ID;
+  }

ioeric wrote:
> Should we check `ID < Size` here?
Not really, here's an example: `(& ({0}, 42) (TRUE {0} out of 10)`.

When `advance()` is called, the underlying AND iterator would point to `42` 
and `advanceTo()` would be called on the TRUE iterator. That moves it to the 
END, which is completely valid (every iterator behaves this way, actually, 
since none of them checks an `ID < LastDocID` equivalent).

I should, however, do `Index = std::min(ID, Size)`, since calling `dump()` and 
getting something like `(TRUE {9000} out of 42)` would be unclear.
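
To illustrate the point above, here is a minimal standalone sketch of the 
clamping behaviour. It is not the clangd class: `TrueIteratorSketch`, 
`position()` and `intersectWithTrue()` are hypothetical names made up for this 
example, `DocID` is assumed to be a 32-bit unsigned integer, and the 
`Iterator` base class is left out so that the snippet compiles on its own.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

using DocID = uint32_t; // Assumption: stand-in for the DocID type in the patch.

// Hypothetical, simplified stand-in for the TRUE iterator over [0, Size).
class TrueIteratorSketch {
public:
  explicit TrueIteratorSketch(DocID Size) : Size(Size) {}

  bool reachedEnd() const { return Index >= Size; }

  void advance() {
    assert(!reachedEnd() && "Can't advance iterator after it reached the end.");
    ++Index;
  }

  // Clamping to Size keeps the stored position meaningful when the requested
  // ID lies past the end, e.g. advanceTo(42) on a list of 10 items.
  void advanceTo(DocID ID) {
    assert(!reachedEnd() && "Can't advance iterator after it reached the end.");
    Index = std::min(ID, Size);
  }

  DocID peek() const {
    assert(!reachedEnd() && "Can't call peek() at the end.");
    return Index;
  }

  // Hypothetical accessor exposing the value a dump() would print.
  DocID position() const { return Index; }

private:
  DocID Index = 0;
  DocID Size; // Size of the virtual posting list.
};

// Hypothetical helper: intersects a sorted posting list with the virtual list
// [0, Size), roughly what an AND over the two iterators would produce.
std::vector<DocID> intersectWithTrue(const std::vector<DocID> &Sorted,
                                     DocID Size) {
  TrueIteratorSketch True(Size);
  std::vector<DocID> Result;
  for (DocID ID : Sorted) {
    if (True.reachedEnd())
      break;
    True.advanceTo(ID);
    if (!True.reachedEnd() && True.peek() == ID)
      Result.push_back(ID);
  }
  return Result;
}
```

With this sketch, calling `advanceTo(42)` on an iterator of size 10 leaves it 
at the end with a stored position of 10 rather than 42, and intersecting 
`{1, 2, 5, 7}` with a TRUE iterator of size 7 drops the out-of-range `7`, in 
line with the unit test in this revision.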


https://reviews.llvm.org/D50955



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D50955: [clangd] Implement TRUE Iterator

2018-08-20 Thread Kirill Bobyrev via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL340155: [clangd] Implement TRUE Iterator (authored by 
omtcyfz, committed by ).
Herald added a subscriber: llvm-commits.

Changed prior to commit:
  https://reviews.llvm.org/D50955?vs=161435&id=161440#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D50955

Files:
  clang-tools-extra/trunk/clangd/index/dex/Iterator.cpp
  clang-tools-extra/trunk/clangd/index/dex/Iterator.h
  clang-tools-extra/trunk/unittests/clangd/DexIndexTests.cpp


Index: clang-tools-extra/trunk/unittests/clangd/DexIndexTests.cpp
===
--- clang-tools-extra/trunk/unittests/clangd/DexIndexTests.cpp
+++ clang-tools-extra/trunk/unittests/clangd/DexIndexTests.cpp
@@ -262,6 +262,19 @@
   EXPECT_THAT(consume(*DocIterator, 0), ElementsAre());
 }
 
+TEST(DexIndexIterators, True) {
+  auto TrueIterator = createTrue(0U);
+  EXPECT_THAT(TrueIterator->reachedEnd(), true);
+  EXPECT_THAT(consume(*TrueIterator), ElementsAre());
+
+  PostingList L0 = {1, 2, 5, 7};
+  TrueIterator = createTrue(7U);
+  EXPECT_THAT(TrueIterator->peek(), 0);
+  auto AndIterator = createAnd(create(L0), move(TrueIterator));
+  EXPECT_THAT(AndIterator->reachedEnd(), false);
+  EXPECT_THAT(consume(*AndIterator), ElementsAre(1, 2, 5));
+}
+
 testing::Matcher>
 trigramsAre(std::initializer_list Trigrams) {
   std::vector Tokens;
Index: clang-tools-extra/trunk/clangd/index/dex/Iterator.cpp
===
--- clang-tools-extra/trunk/clangd/index/dex/Iterator.cpp
+++ clang-tools-extra/trunk/clangd/index/dex/Iterator.cpp
@@ -225,6 +225,41 @@
   std::vector> Children;
 };
 
+/// TrueIterator handles PostingLists which contain all items of the index. It
+/// stores size of the virtual posting list, and all operations are performed
+/// in O(1).
+class TrueIterator : public Iterator {
+public:
+  TrueIterator(DocID Size) : Size(Size) {}
+
+  bool reachedEnd() const override { return Index >= Size; }
+
+  void advance() override {
+    assert(!reachedEnd() && "Can't advance iterator after it reached the end.");
+    ++Index;
+  }
+
+  void advanceTo(DocID ID) override {
+    assert(!reachedEnd() && "Can't advance iterator after it reached the end.");
+    Index = std::min(ID, Size);
+  }
+
+  DocID peek() const override {
+    assert(!reachedEnd() && "TrueIterator can't call peek() at the end.");
+    return Index;
+  }
+
+private:
+  llvm::raw_ostream &dump(llvm::raw_ostream &OS) const override {
+    OS << "(TRUE {" << Index << "} out of " << Size << ")";
+    return OS;
+  }
+
+  DocID Index = 0;
+  /// Size of the underlying virtual PostingList.
+  DocID Size;
+};
+
 } // end namespace
 
 std::vector consume(Iterator &It, size_t Limit) {
@@ -249,6 +284,10 @@
   return llvm::make_unique(move(Children));
 }
 
+std::unique_ptr<Iterator> createTrue(DocID Size) {
+  return llvm::make_unique<TrueIterator>(Size);
+}
+
 } // namespace dex
 } // namespace clangd
 } // namespace clang
Index: clang-tools-extra/trunk/clangd/index/dex/Iterator.h
===
--- clang-tools-extra/trunk/clangd/index/dex/Iterator.h
+++ clang-tools-extra/trunk/clangd/index/dex/Iterator.h
@@ -129,6 +129,10 @@
 std::unique_ptr
 createOr(std::vector> Children);
 
+/// Returns TRUE Iterator which iterates over "virtual" PostingList containing
+/// all items in range [0, Size) in an efficient manner.
+std::unique_ptr<Iterator> createTrue(DocID Size);
+
 /// This allows createAnd(create(...), create(...)) syntax.
 template  std::unique_ptr createAnd(Args... args) {
   std::vector> Children;


[PATCH] D50956: [clangd] NFC: Cleanup Dex Iterator comments and simplify tests

2018-08-20 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev updated this revision to Diff 161443.
kbobyrev edited the summary of this revision.
kbobyrev added a comment.

Something I initially forgot: fix misplaced `private:` so that `dump()` stays 
private as it was in the interface.


https://reviews.llvm.org/D50956

Files:
  clang-tools-extra/clangd/index/dex/Iterator.cpp
  clang-tools-extra/clangd/index/dex/Iterator.h
  clang-tools-extra/unittests/clangd/DexIndexTests.cpp

Index: clang-tools-extra/unittests/clangd/DexIndexTests.cpp
===
--- clang-tools-extra/unittests/clangd/DexIndexTests.cpp
+++ clang-tools-extra/unittests/clangd/DexIndexTests.cpp
@@ -28,33 +28,33 @@
   auto DocIterator = create(L);
 
   EXPECT_EQ(DocIterator->peek(), 4U);
-  EXPECT_EQ(DocIterator->reachedEnd(), false);
+  EXPECT_FALSE(DocIterator->reachedEnd());
 
   DocIterator->advance();
   EXPECT_EQ(DocIterator->peek(), 7U);
-  EXPECT_EQ(DocIterator->reachedEnd(), false);
+  EXPECT_FALSE(DocIterator->reachedEnd());
 
   DocIterator->advanceTo(20);
   EXPECT_EQ(DocIterator->peek(), 20U);
-  EXPECT_EQ(DocIterator->reachedEnd(), false);
+  EXPECT_FALSE(DocIterator->reachedEnd());
 
   DocIterator->advanceTo(65);
   EXPECT_EQ(DocIterator->peek(), 100U);
-  EXPECT_EQ(DocIterator->reachedEnd(), false);
+  EXPECT_FALSE(DocIterator->reachedEnd());
 
   DocIterator->advanceTo(420);
-  EXPECT_EQ(DocIterator->reachedEnd(), true);
+  EXPECT_TRUE(DocIterator->reachedEnd());
 }
 
 TEST(DexIndexIterators, AndWithEmpty) {
   const PostingList L0;
   const PostingList L1 = {0, 5, 7, 10, 42, 320, 9000};
 
   auto AndEmpty = createAnd(create(L0));
-  EXPECT_EQ(AndEmpty->reachedEnd(), true);
+  EXPECT_TRUE(AndEmpty->reachedEnd());
 
   auto AndWithEmpty = createAnd(create(L0), create(L1));
-  EXPECT_EQ(AndWithEmpty->reachedEnd(), true);
+  EXPECT_TRUE(AndWithEmpty->reachedEnd());
 
   EXPECT_THAT(consume(*AndWithEmpty), ElementsAre());
 }
@@ -65,7 +65,7 @@
 
   auto And = createAnd(create(L1), create(L0));
 
-  EXPECT_EQ(And->reachedEnd(), false);
+  EXPECT_FALSE(And->reachedEnd());
   EXPECT_THAT(consume(*And), ElementsAre(0U, 7U, 10U, 320U, 9000U));
 
   And = createAnd(create(L0), create(L1));
@@ -94,18 +94,18 @@
   EXPECT_EQ(And->peek(), 320U);
   And->advanceTo(10);
 
-  EXPECT_EQ(And->reachedEnd(), true);
+  EXPECT_TRUE(And->reachedEnd());
 }
 
 TEST(DexIndexIterators, OrWithEmpty) {
   const PostingList L0;
   const PostingList L1 = {0, 5, 7, 10, 42, 320, 9000};
 
   auto OrEmpty = createOr(create(L0));
-  EXPECT_EQ(OrEmpty->reachedEnd(), true);
+  EXPECT_TRUE(OrEmpty->reachedEnd());
 
   auto OrWithEmpty = createOr(create(L0), create(L1));
-  EXPECT_EQ(OrWithEmpty->reachedEnd(), false);
+  EXPECT_FALSE(OrWithEmpty->reachedEnd());
 
   EXPECT_THAT(consume(*OrWithEmpty),
   ElementsAre(0U, 5U, 7U, 10U, 42U, 320U, 9000U));
@@ -117,7 +117,7 @@
 
   auto Or = createOr(create(L0), create(L1));
 
-  EXPECT_EQ(Or->reachedEnd(), false);
+  EXPECT_FALSE(Or->reachedEnd());
   EXPECT_EQ(Or->peek(), 0U);
   Or->advance();
   EXPECT_EQ(Or->peek(), 4U);
@@ -136,7 +136,7 @@
   Or->advanceTo(9000);
   EXPECT_EQ(Or->peek(), 9000U);
   Or->advanceTo(9001);
-  EXPECT_EQ(Or->reachedEnd(), true);
+  EXPECT_TRUE(Or->reachedEnd());
 
   Or = createOr(create(L0), create(L1));
 
@@ -151,7 +151,7 @@
 
   auto Or = createOr(create(L0), create(L1), create(L2));
 
-  EXPECT_EQ(Or->reachedEnd(), false);
+  EXPECT_FALSE(Or->reachedEnd());
   EXPECT_EQ(Or->peek(), 0U);
 
   Or->advance();
@@ -166,7 +166,7 @@
   EXPECT_EQ(Or->peek(), 60U);
 
   Or->advanceTo(9001);
-  EXPECT_EQ(Or->reachedEnd(), true);
+  EXPECT_TRUE(Or->reachedEnd());
 }
 
 // FIXME(kbobyrev): The testcase below is similar to what is expected in real
@@ -208,7 +208,7 @@
   // Lower Or Iterator: [0, 1, 5]
   createOr(create(L2), create(L3), create(L4)));
 
-  EXPECT_EQ(Root->reachedEnd(), false);
+  EXPECT_FALSE(Root->reachedEnd());
   EXPECT_EQ(Root->peek(), 1U);
   Root->advanceTo(0);
   // Advance multiple times. Shouldn't do anything.
@@ -220,7 +220,7 @@
   Root->advanceTo(5);
   EXPECT_EQ(Root->peek(), 5U);
   Root->advanceTo(9000);
-  EXPECT_EQ(Root->reachedEnd(), true);
+  EXPECT_TRUE(Root->reachedEnd());
 }
 
 TEST(DexIndexIterators, StringRepresentation) {
@@ -264,14 +264,14 @@
 
 TEST(DexIndexIterators, True) {
   auto TrueIterator = createTrue(0U);
-  EXPECT_THAT(TrueIterator->reachedEnd(), true);
+  EXPECT_TRUE(TrueIterator->reachedEnd());
   EXPECT_THAT(consume(*TrueIterator), ElementsAre());
 
   PostingList L0 = {1, 2, 5, 7};
   TrueIterator = createTrue(7U);
   EXPECT_THAT(TrueIterator->peek(), 0);
   auto AndIterator = createAnd(create(L0), move(TrueIterator));
-  EXPECT_THAT(AndIterator->reachedEnd(), false);
+  EXPECT_FALSE(AndIterator->reachedEnd());
   EXPECT_THAT(consume(*AndIterator), ElementsAre(1, 2, 5));
 }
 
Index: clang-tools-extra/clangd/index/dex/Iterator.h
===
--- clang-tools-ext

[PATCH] D50956: [clangd] NFC: Cleanup Dex Iterator comments and simplify tests

2018-08-20 Thread Kirill Bobyrev via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL340157: [clangd] NFC: Cleanup Dex Iterator comments and 
simplify tests (authored by omtcyfz, committed by ).
Herald added a subscriber: llvm-commits.

Changed prior to commit:
  https://reviews.llvm.org/D50956?vs=161443&id=161448#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D50956

Files:
  clang-tools-extra/trunk/clangd/index/dex/Iterator.cpp
  clang-tools-extra/trunk/clangd/index/dex/Iterator.h
  clang-tools-extra/trunk/unittests/clangd/DexIndexTests.cpp

Index: clang-tools-extra/trunk/unittests/clangd/DexIndexTests.cpp
===
--- clang-tools-extra/trunk/unittests/clangd/DexIndexTests.cpp
+++ clang-tools-extra/trunk/unittests/clangd/DexIndexTests.cpp
@@ -28,33 +28,33 @@
   auto DocIterator = create(L);
 
   EXPECT_EQ(DocIterator->peek(), 4U);
-  EXPECT_EQ(DocIterator->reachedEnd(), false);
+  EXPECT_FALSE(DocIterator->reachedEnd());
 
   DocIterator->advance();
   EXPECT_EQ(DocIterator->peek(), 7U);
-  EXPECT_EQ(DocIterator->reachedEnd(), false);
+  EXPECT_FALSE(DocIterator->reachedEnd());
 
   DocIterator->advanceTo(20);
   EXPECT_EQ(DocIterator->peek(), 20U);
-  EXPECT_EQ(DocIterator->reachedEnd(), false);
+  EXPECT_FALSE(DocIterator->reachedEnd());
 
   DocIterator->advanceTo(65);
   EXPECT_EQ(DocIterator->peek(), 100U);
-  EXPECT_EQ(DocIterator->reachedEnd(), false);
+  EXPECT_FALSE(DocIterator->reachedEnd());
 
   DocIterator->advanceTo(420);
-  EXPECT_EQ(DocIterator->reachedEnd(), true);
+  EXPECT_TRUE(DocIterator->reachedEnd());
 }
 
 TEST(DexIndexIterators, AndWithEmpty) {
   const PostingList L0;
   const PostingList L1 = {0, 5, 7, 10, 42, 320, 9000};
 
   auto AndEmpty = createAnd(create(L0));
-  EXPECT_EQ(AndEmpty->reachedEnd(), true);
+  EXPECT_TRUE(AndEmpty->reachedEnd());
 
   auto AndWithEmpty = createAnd(create(L0), create(L1));
-  EXPECT_EQ(AndWithEmpty->reachedEnd(), true);
+  EXPECT_TRUE(AndWithEmpty->reachedEnd());
 
   EXPECT_THAT(consume(*AndWithEmpty), ElementsAre());
 }
@@ -65,7 +65,7 @@
 
   auto And = createAnd(create(L1), create(L0));
 
-  EXPECT_EQ(And->reachedEnd(), false);
+  EXPECT_FALSE(And->reachedEnd());
   EXPECT_THAT(consume(*And), ElementsAre(0U, 7U, 10U, 320U, 9000U));
 
   And = createAnd(create(L0), create(L1));
@@ -94,18 +94,18 @@
   EXPECT_EQ(And->peek(), 320U);
   And->advanceTo(10);
 
-  EXPECT_EQ(And->reachedEnd(), true);
+  EXPECT_TRUE(And->reachedEnd());
 }
 
 TEST(DexIndexIterators, OrWithEmpty) {
   const PostingList L0;
   const PostingList L1 = {0, 5, 7, 10, 42, 320, 9000};
 
   auto OrEmpty = createOr(create(L0));
-  EXPECT_EQ(OrEmpty->reachedEnd(), true);
+  EXPECT_TRUE(OrEmpty->reachedEnd());
 
   auto OrWithEmpty = createOr(create(L0), create(L1));
-  EXPECT_EQ(OrWithEmpty->reachedEnd(), false);
+  EXPECT_FALSE(OrWithEmpty->reachedEnd());
 
   EXPECT_THAT(consume(*OrWithEmpty),
   ElementsAre(0U, 5U, 7U, 10U, 42U, 320U, 9000U));
@@ -117,7 +117,7 @@
 
   auto Or = createOr(create(L0), create(L1));
 
-  EXPECT_EQ(Or->reachedEnd(), false);
+  EXPECT_FALSE(Or->reachedEnd());
   EXPECT_EQ(Or->peek(), 0U);
   Or->advance();
   EXPECT_EQ(Or->peek(), 4U);
@@ -136,7 +136,7 @@
   Or->advanceTo(9000);
   EXPECT_EQ(Or->peek(), 9000U);
   Or->advanceTo(9001);
-  EXPECT_EQ(Or->reachedEnd(), true);
+  EXPECT_TRUE(Or->reachedEnd());
 
   Or = createOr(create(L0), create(L1));
 
@@ -151,7 +151,7 @@
 
   auto Or = createOr(create(L0), create(L1), create(L2));
 
-  EXPECT_EQ(Or->reachedEnd(), false);
+  EXPECT_FALSE(Or->reachedEnd());
   EXPECT_EQ(Or->peek(), 0U);
 
   Or->advance();
@@ -166,7 +166,7 @@
   EXPECT_EQ(Or->peek(), 60U);
 
   Or->advanceTo(9001);
-  EXPECT_EQ(Or->reachedEnd(), true);
+  EXPECT_TRUE(Or->reachedEnd());
 }
 
 // FIXME(kbobyrev): The testcase below is similar to what is expected in real
@@ -208,7 +208,7 @@
   // Lower Or Iterator: [0, 1, 5]
   createOr(create(L2), create(L3), create(L4)));
 
-  EXPECT_EQ(Root->reachedEnd(), false);
+  EXPECT_FALSE(Root->reachedEnd());
   EXPECT_EQ(Root->peek(), 1U);
   Root->advanceTo(0);
   // Advance multiple times. Shouldn't do anything.
@@ -220,7 +220,7 @@
   Root->advanceTo(5);
   EXPECT_EQ(Root->peek(), 5U);
   Root->advanceTo(9000);
-  EXPECT_EQ(Root->reachedEnd(), true);
+  EXPECT_TRUE(Root->reachedEnd());
 }
 
 TEST(DexIndexIterators, StringRepresentation) {
@@ -264,14 +264,14 @@
 
 TEST(DexIndexIterators, True) {
   auto TrueIterator = createTrue(0U);
-  EXPECT_THAT(TrueIterator->reachedEnd(), true);
+  EXPECT_TRUE(TrueIterator->reachedEnd());
   EXPECT_THAT(consume(*TrueIterator), ElementsAre());
 
   PostingList L0 = {1, 2, 5, 7};
   TrueIterator = createTrue(7U);
   EXPECT_THAT(TrueIterator->peek(), 0);
   auto AndIterator = createAnd(create(L0), move(TrueIterator));
-  EXPECT_THAT(AndIterator->reachedEnd(), false);
+  EXPECT_FALSE(AndIterator->reachedEnd());
   EXPECT_THAT(consume(*AndIterator), ElementsAre(1, 2,

[PATCH] D50970: [clangd] Introduce BOOST iterators

2018-08-20 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev created this revision.
kbobyrev added reviewers: ioeric, ilya-biryukov.
kbobyrev added a project: clang-tools-extra.
Herald added subscribers: kadircet, arphaman, jkorous, MaskRay.

https://reviews.llvm.org/D50970

Files:
  clang-tools-extra/clangd/index/dex/Iterator.cpp
  clang-tools-extra/clangd/index/dex/Iterator.h
  clang-tools-extra/unittests/clangd/DexIndexTests.cpp

Index: clang-tools-extra/unittests/clangd/DexIndexTests.cpp
===
--- clang-tools-extra/unittests/clangd/DexIndexTests.cpp
+++ clang-tools-extra/unittests/clangd/DexIndexTests.cpp
@@ -175,26 +175,29 @@
 // beneficial to implement automatic generation of query trees for more
 // comprehensive testing.
 TEST(DexIndexIterators, QueryTree) {
-  // An example of more complicated query
   //
   //  +-+
   //  |And Iterator:1, 5|
   //  +++
   //   |
   //   |
-  // ++
+  // +-+--+
   // ||
   // ||
   //  +--v--+  +--v-+
   //  |And Iterator: 1, 5, 9|  |Or Iterator: 0, 1, 5|
   //  +--+--+  +--+-+
   // ||
-  //  +--+-++-+---+
+  //  +--+-++-+
   //  ||| |   |
-  //  +---v-+ +v-+   +--v--++-V--++---v---+
-  //  |1, 3, 5, 8, 9| |1, 5, 7, 9|   |Empty||0, 5||0, 1, 5|
-  //  +-+ +--+   +-++++---+
-
+  //  +---v-+ ++---+ +--v--+  +---v+ +v---+
+  //  |1, 3, 5, 8, 9| |Boost: 2| |Empty|  |Boost: 3| |Boost: 4|
+  //  +-+ ++---+ +-+  +---++ ++---+
+  //   |  |   |
+  //  +v-+  +-v--++---v---+
+  //  |1, 5, 7, 9|  |0, 5||0, 1, 5|
+  //  +--+  +++---+
+  //
   const PostingList L0 = {1, 3, 5, 8, 9};
   const PostingList L1 = {1, 5, 7, 9};
   const PostingList L2 = {0, 5};
@@ -204,21 +207,26 @@
   // Root of the query tree: [1, 5]
   auto Root = createAnd(
   // Lower And Iterator: [1, 5, 9]
-  createAnd(create(L0), create(L1)),
+  createAnd(create(L0), createBoost(create(L1), 2U)),
   // Lower Or Iterator: [0, 1, 5]
-  createOr(create(L2), create(L3), create(L4)));
+  createOr(create(L2), createBoost(create(L3), 3U),
+   createBoost(create(L4), 4U)));
 
   EXPECT_FALSE(Root->reachedEnd());
   EXPECT_EQ(Root->peek(), 1U);
   Root->advanceTo(0);
   // Advance multiple times. Shouldn't do anything.
   Root->advanceTo(1);
   Root->advanceTo(0);
   EXPECT_EQ(Root->peek(), 1U);
+  auto ElementBoost = Root->boost(Root->peek());
+  EXPECT_THAT(ElementBoost, 6);
   Root->advance();
   EXPECT_EQ(Root->peek(), 5U);
   Root->advanceTo(5);
   EXPECT_EQ(Root->peek(), 5U);
+  ElementBoost = Root->boost(Root->peek());
+  EXPECT_THAT(ElementBoost, 8);
   Root->advanceTo(9000);
   EXPECT_TRUE(Root->reachedEnd());
 }
@@ -275,6 +283,34 @@
   EXPECT_THAT(consume(*AndIterator), ElementsAre(1, 2, 5));
 }
 
+TEST(DexIndexIterators, Boost) {
+  auto BoostIterator = createBoost(createTrue(5U), 42U);
+  EXPECT_FALSE(BoostIterator->reachedEnd());
+  auto ElementBoost = BoostIterator->boost(BoostIterator->peek());
+  EXPECT_THAT(ElementBoost, 42U);
+
+  const PostingList L0 = {2, 4};
+  const PostingList L1 = {1, 4};
+  auto Root = createOr(createTrue(5U), createBoost(create(L0), 2U),
+   createBoost(create(L1), 3U));
+
+  ElementBoost = Root->boost(Root->peek());
+  EXPECT_THAT(ElementBoost, Iterator::DEFAULT_BOOST_SCORE);
+  Root->advance();
+  EXPECT_THAT(Root->peek(), 1U);
+  ElementBoost = Root->boost(Root->peek());
+  EXPECT_THAT(ElementBoost, 3);
+
+  Root->advance();
+  EXPECT_THAT(Root->peek(), 2U);
+  ElementBoost = Root->boost(Root->peek());
+  EXPECT_THAT(ElementBoost, 2);
+
+  Root->advanceTo(4);
+  ElementBoost = Root->boost(Root->peek());
+  EXPECT_THAT(ElementBoost, 3);
+}
+
 testing::Matcher<std::vector<Token>>
 trigramsAre(std::initializer_list<std::string> Trigrams) {
   std::vector<Token> Tokens;
Index: clang-tools-extra/clangd/index/dex/Iterator.h
===
--- clang-tools-extra/clangd/index/dex/Iterator.h
+++ clang-tools-extra/clangd/index/dex/Iterator.h
@@ -89,6 +89,8 @@
   ///

[PATCH] D50970: [clangd] Introduce BOOST iterators

2018-08-20 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev planned changes to this revision.
kbobyrev added a comment.

This patch is in preview mode; a documentation overhaul and minor cleanup are 
incoming.


https://reviews.llvm.org/D50970



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D50970: [clangd] Implement BOOST iterator

2018-08-20 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev updated this revision to Diff 161477.
kbobyrev retitled this revision from "[clangd] Introduce BOOST iterators" to 
"[clangd] Implement BOOST iterator".
kbobyrev edited the summary of this revision.
kbobyrev added a comment.

Add documentation, cleanup tests.
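
For readers checking the QueryTree expectations in this revision: the boosts
of 6 (element 1) and 8 (element 5) are consistent with BOOST multiplying its
child's score, AND taking the product of its children, and OR taking the
maximum over the children that contain the element. Iterator.cpp is not quoted
in full here, so these rules are an assumption. The Node type and the helper
names below are hypothetical and are not the Iterator API from the patch. A
minimal self-contained sketch under those assumptions:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

using DocID = std::uint32_t;

// Hypothetical expression-tree node; not the Iterator class from the patch.
struct Node {
  enum Kind { List, Boost, And, Or } K = List;
  std::vector<DocID> Docs;    // Sorted posting list (List nodes only).
  float Factor = 1.0f;        // Multiplier (Boost nodes only).
  std::vector<Node> Children; // Operands (Boost, And and Or nodes).
};

Node list(std::vector<DocID> Docs) {
  Node N;
  N.Docs = std::move(Docs);
  return N;
}

Node boosted(Node Child, float Factor) {
  Node N;
  N.K = Node::Boost;
  N.Factor = Factor;
  N.Children.push_back(std::move(Child));
  return N;
}

Node combine(Node::Kind K, std::vector<Node> Children) {
  assert(K == Node::And || K == Node::Or);
  Node N;
  N.K = K;
  N.Children = std::move(Children);
  return N;
}

// True if the subtree rooted at N would yield ID.
bool contains(const Node &N, DocID ID) {
  auto Has = [ID](const Node &C) { return contains(C, ID); };
  switch (N.K) {
  case Node::List:
    return std::binary_search(N.Docs.begin(), N.Docs.end(), ID);
  case Node::Boost:
    return contains(N.Children.front(), ID);
  case Node::And:
    return std::all_of(N.Children.begin(), N.Children.end(), Has);
  case Node::Or:
    return std::any_of(N.Children.begin(), N.Children.end(), Has);
  }
  return false;
}

// Assumed rules: Boost multiplies, And takes the product, Or takes the
// maximum over the children that contain ID; plain lists score 1.
float boost(const Node &N, DocID ID) {
  float Result = 1.0f;
  switch (N.K) {
  case Node::List:
    break;
  case Node::Boost:
    Result = N.Factor * boost(N.Children.front(), ID);
    break;
  case Node::And:
    for (const Node &C : N.Children)
      Result *= boost(C, ID);
    break;
  case Node::Or:
    for (const Node &C : N.Children)
      if (contains(C, ID))
        Result = std::max(Result, boost(C, ID));
    break;
  }
  return Result;
}

// The tree from the updated QueryTree test (L0, L1, L3, L4, L5).
Node queryTree() {
  return combine(
      Node::And,
      {combine(Node::And,
               {list({1, 3, 5, 8, 9}), boosted(list({1, 5, 7, 9}), 2)}),
       combine(Node::Or, {list({}), boosted(list({1, 5}), 3),
                          boosted(list({0, 3, 5}), 4)})});
}
```

Under these assumed rules, boost(queryTree(), 1) evaluates to 2 * 3 and
boost(queryTree(), 5) to 2 * 4, which agrees with the EXPECT_THAT lines in the
test.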


https://reviews.llvm.org/D50970

Files:
  clang-tools-extra/clangd/index/dex/Iterator.cpp
  clang-tools-extra/clangd/index/dex/Iterator.h
  clang-tools-extra/unittests/clangd/DexIndexTests.cpp

Index: clang-tools-extra/unittests/clangd/DexIndexTests.cpp
===
--- clang-tools-extra/unittests/clangd/DexIndexTests.cpp
+++ clang-tools-extra/unittests/clangd/DexIndexTests.cpp
@@ -172,53 +172,61 @@
 // FIXME(kbobyrev): The testcase below is similar to what is expected in real
 // queries. It should be updated once new iterators (such as boosting, limiting,
 // etc iterators) appear. However, it is not exhaustive and it would be
-// beneficial to implement automatic generation of query trees for more
-// comprehensive testing.
+// beneficial to implement automatic generation (e.g. fuzzing) of query trees
+// for more comprehensive testing.
 TEST(DexIndexIterators, QueryTree) {
-  // An example of more complicated query
   //
   //  +-+
   //  |And Iterator:1, 5|
   //  +++
   //   |
   //   |
-  // ++
+  // +-+--+
   // ||
   // ||
-  //  +--v--+  +--v-+
-  //  |And Iterator: 1, 5, 9|  |Or Iterator: 0, 1, 5|
-  //  +--+--+  +--+-+
+  //  +--v--+  +--v+
+  //  |And Iterator: 1, 5, 9|  |Or Iterator: 0, 1, 3, 5|
+  //  +--+--+  +--++
   // ||
-  //  +--+-++-+---+
+  //  +--+-++-+
   //  ||| |   |
-  //  +---v-+ +v-+   +--v--++-V--++---v---+
-  //  |1, 3, 5, 8, 9| |1, 5, 7, 9|   |Empty||0, 5||0, 1, 5|
-  //  +-+ +--+   +-++++---+
-
+  //  +---v-+ ++---+ +--v--+  +---v+ +v---+
+  //  |1, 3, 5, 8, 9| |Boost: 2| |Empty|  |Boost: 3| |Boost: 4|
+  //  +-+ ++---+ +-+  +---++ ++---+
+  //   |  |   |
+  //  +v-+  +-v--++---v---+
+  //  |1, 5, 7, 9|  |1, 5||0, 3, 5|
+  //  +--+  +++---+
+  //
   const PostingList L0 = {1, 3, 5, 8, 9};
   const PostingList L1 = {1, 5, 7, 9};
-  const PostingList L2 = {0, 5};
-  const PostingList L3 = {0, 1, 5};
-  const PostingList L4;
+  const PostingList L3;
+  const PostingList L4 = {1, 5};
+  const PostingList L5 = {0, 3, 5};
 
   // Root of the query tree: [1, 5]
   auto Root = createAnd(
   // Lower And Iterator: [1, 5, 9]
-  createAnd(create(L0), create(L1)),
+  createAnd(create(L0), createBoost(create(L1), 2U)),
   // Lower Or Iterator: [0, 1, 5]
-  createOr(create(L2), create(L3), create(L4)));
+  createOr(create(L3), createBoost(create(L4), 3U),
+   createBoost(create(L5), 4U)));
 
   EXPECT_FALSE(Root->reachedEnd());
   EXPECT_EQ(Root->peek(), 1U);
   Root->advanceTo(0);
   // Advance multiple times. Shouldn't do anything.
   Root->advanceTo(1);
   Root->advanceTo(0);
   EXPECT_EQ(Root->peek(), 1U);
+  auto ElementBoost = Root->boost(Root->peek());
+  EXPECT_THAT(ElementBoost, 6);
   Root->advance();
   EXPECT_EQ(Root->peek(), 5U);
   Root->advanceTo(5);
   EXPECT_EQ(Root->peek(), 5U);
+  ElementBoost = Root->boost(Root->peek());
+  EXPECT_THAT(ElementBoost, 8);
   Root->advanceTo(9000);
   EXPECT_TRUE(Root->reachedEnd());
 }
@@ -275,6 +283,34 @@
   EXPECT_THAT(consume(*AndIterator), ElementsAre(1, 2, 5));
 }
 
+TEST(DexIndexIterators, Boost) {
+  auto BoostIterator = createBoost(createTrue(5U), 42U);
+  EXPECT_FALSE(BoostIterator->reachedEnd());
+  auto ElementBoost = BoostIterator->boost(BoostIterator->peek());
+  EXPECT_THAT(ElementBoost, 42U);
+
+  const PostingList L0 = {2, 4};
+  const PostingList L1 = {1, 4};
+  auto Root = createOr(createTrue(5U), createBoost(create(L0), 2U),
+   createBoost(create(L1), 3U));
+
+  ElementBoost = Root->b

[PATCH] D50337: [clangd] DexIndex implementation prototype

2018-08-20 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev updated this revision to Diff 161480.
kbobyrev marked 6 inline comments as done.
kbobyrev added a comment.

Address post-LGTM comments.


https://reviews.llvm.org/D50337

Files:
  clang-tools-extra/clangd/CMakeLists.txt
  clang-tools-extra/clangd/index/dex/DexIndex.cpp
  clang-tools-extra/clangd/index/dex/DexIndex.h
  clang-tools-extra/clangd/index/dex/Token.h
  clang-tools-extra/unittests/clangd/CMakeLists.txt
  clang-tools-extra/unittests/clangd/DexIndexTests.cpp
  clang-tools-extra/unittests/clangd/IndexTests.cpp
  clang-tools-extra/unittests/clangd/TestIndex.cpp
  clang-tools-extra/unittests/clangd/TestIndex.h

Index: clang-tools-extra/unittests/clangd/TestIndex.h
===
--- /dev/null
+++ clang-tools-extra/unittests/clangd/TestIndex.h
@@ -0,0 +1,64 @@
+//===-- IndexHelpers.h --*- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+
+#ifndef LLVM_CLANG_TOOLS_EXTRA_UNITTESTS_CLANGD_INDEXTESTCOMMON_H
+#define LLVM_CLANG_TOOLS_EXTRA_UNITTESTS_CLANGD_INDEXTESTCOMMON_H
+
+#include "index/Index.h"
+#include "index/Merge.h"
+#include "index/dex/DexIndex.h"
+#include "index/dex/Iterator.h"
+#include "index/dex/Token.h"
+#include "index/dex/Trigram.h"
+
+namespace clang {
+namespace clangd {
+
+// Creates Symbol instance and sets SymbolID to given QualifiedName.
+Symbol symbol(llvm::StringRef QName);
+
+// Bundles symbol pointers with the actual symbol slab the pointers refer to in
+// order to ensure that the slab isn't destroyed while it's used by an index.
+struct SlabAndPointers {
+  SymbolSlab Slab;
+  std::vector<const Symbol *> Pointers;
+};
+
+// Create a slab of symbols with the given qualified names as both IDs and
+// names. The lifetime of the slab is managed by the returned shared pointer.
+// If \p WeakSymbols is provided, it will be pointed to the managed object in
+// the returned shared pointer.
+std::shared_ptr<std::vector<const Symbol *>>
+generateSymbols(std::vector<std::string> QualifiedNames,
+                std::weak_ptr<SlabAndPointers> *WeakSymbols = nullptr);
+
+// Create a slab of symbols with IDs and names [Begin, End], otherwise identical
+// to the `generateSymbols` above.
+std::shared_ptr<std::vector<const Symbol *>>
+generateNumSymbols(int Begin, int End,
+                   std::weak_ptr<SlabAndPointers> *WeakSymbols = nullptr);
+
+// Returns fully-qualified name out of given symbol.
+std::string getQualifiedName(const Symbol &Sym);
+
+// Performs fuzzy matching-based symbol lookup given a query and an index.
+// Incomplete is set true if more items than requested can be retrieved, false
+// otherwise.
+std::vector<std::string> match(const SymbolIndex &I,
+                               const FuzzyFindRequest &Req,
+                               bool *Incomplete = nullptr);
+
+// Returns qualified names of symbols with any of IDs in the index.
+std::vector<std::string> lookup(const SymbolIndex &I,
+                                llvm::ArrayRef<SymbolID> IDs);
+
+} // namespace clangd
+} // namespace clang
+
+#endif
Index: clang-tools-extra/unittests/clangd/TestIndex.cpp
===
--- /dev/null
+++ clang-tools-extra/unittests/clangd/TestIndex.cpp
@@ -0,0 +1,83 @@
+//===-- IndexHelpers.cpp *- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+
+#include "TestIndex.h"
+
+namespace clang {
+namespace clangd {
+
+Symbol symbol(llvm::StringRef QName) {
+  Symbol Sym;
+  Sym.ID = SymbolID(QName.str());
+  size_t Pos = QName.rfind("::");
+  if (Pos == llvm::StringRef::npos) {
+    Sym.Name = QName;
+    Sym.Scope = "";
+  } else {
+    Sym.Name = QName.substr(Pos + 2);
+    Sym.Scope = QName.substr(0, Pos + 2);
+  }
+  return Sym;
+}
+
+std::shared_ptr<std::vector<const Symbol *>>
+generateSymbols(std::vector<std::string> QualifiedNames,
+                std::weak_ptr<SlabAndPointers> *WeakSymbols) {
+  SymbolSlab::Builder Slab;
+  for (llvm::StringRef QName : QualifiedNames)
+    Slab.insert(symbol(QName));
+
+  auto Storage = std::make_shared<SlabAndPointers>();
+  Storage->Slab = std::move(Slab).build();
+  for (const auto &Sym : Storage->Slab)
+    Storage->Pointers.push_back(&Sym);
+  if (WeakSymbols)
+    *WeakSymbols = Storage;
+  auto *Pointers = &Storage->Pointers;
+  return {std::move(Storage), Pointers};
+}
+
+std::shared_ptr<std::vector<const Symbol *>>
+generateNumSymbols(int Begin, int End,
+                   std::weak_ptr<SlabAndPointers> *WeakSymbols) {
+  std::vector<std::string> Names;
+  for (int i = Begin; i <= End; i++)
+    Names.push_back(std::to_string(i));
+  return generateSymbols(Names, WeakSymbols);
+}
+
+std::string getQualifiedName(const Symbol &Sym) {
+  return (Sym.Scope + Sym.N

[PATCH] D50337: [clangd] DexIndex implementation prototype

2018-08-20 Thread Kirill Bobyrev via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL340175: [clangd] DexIndex implementation prototype (authored 
by omtcyfz, committed by ).
Herald added a subscriber: llvm-commits.

Changed prior to commit:
  https://reviews.llvm.org/D50337?vs=161480&id=161481#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D50337

Files:
  clang-tools-extra/trunk/clangd/CMakeLists.txt
  clang-tools-extra/trunk/clangd/index/dex/DexIndex.cpp
  clang-tools-extra/trunk/clangd/index/dex/DexIndex.h
  clang-tools-extra/trunk/clangd/index/dex/Token.h
  clang-tools-extra/trunk/unittests/clangd/CMakeLists.txt
  clang-tools-extra/trunk/unittests/clangd/DexIndexTests.cpp
  clang-tools-extra/trunk/unittests/clangd/IndexTests.cpp
  clang-tools-extra/trunk/unittests/clangd/TestIndex.cpp
  clang-tools-extra/trunk/unittests/clangd/TestIndex.h

Index: clang-tools-extra/trunk/unittests/clangd/TestIndex.cpp
===
--- clang-tools-extra/trunk/unittests/clangd/TestIndex.cpp
+++ clang-tools-extra/trunk/unittests/clangd/TestIndex.cpp
@@ -0,0 +1,83 @@
+//===-- IndexHelpers.cpp *- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+
+#include "TestIndex.h"
+
+namespace clang {
+namespace clangd {
+
+Symbol symbol(llvm::StringRef QName) {
+  Symbol Sym;
+  Sym.ID = SymbolID(QName.str());
+  size_t Pos = QName.rfind("::");
+  if (Pos == llvm::StringRef::npos) {
+    Sym.Name = QName;
+    Sym.Scope = "";
+  } else {
+    Sym.Name = QName.substr(Pos + 2);
+    Sym.Scope = QName.substr(0, Pos + 2);
+  }
+  return Sym;
+}
+
+std::shared_ptr<std::vector<const Symbol *>>
+generateSymbols(std::vector<std::string> QualifiedNames,
+                std::weak_ptr<SlabAndPointers> *WeakSymbols) {
+  SymbolSlab::Builder Slab;
+  for (llvm::StringRef QName : QualifiedNames)
+    Slab.insert(symbol(QName));
+
+  auto Storage = std::make_shared<SlabAndPointers>();
+  Storage->Slab = std::move(Slab).build();
+  for (const auto &Sym : Storage->Slab)
+    Storage->Pointers.push_back(&Sym);
+  if (WeakSymbols)
+    *WeakSymbols = Storage;
+  auto *Pointers = &Storage->Pointers;
+  return {std::move(Storage), Pointers};
+}
+
+std::shared_ptr<std::vector<const Symbol *>>
+generateNumSymbols(int Begin, int End,
+                   std::weak_ptr<SlabAndPointers> *WeakSymbols) {
+  std::vector<std::string> Names;
+  for (int i = Begin; i <= End; i++)
+    Names.push_back(std::to_string(i));
+  return generateSymbols(Names, WeakSymbols);
+}
+
+std::string getQualifiedName(const Symbol &Sym) {
+  return (Sym.Scope + Sym.Name).str();
+}
+
+std::vector<std::string> match(const SymbolIndex &I,
+                               const FuzzyFindRequest &Req, bool *Incomplete) {
+  std::vector<std::string> Matches;
+  bool IsIncomplete = I.fuzzyFind(Req, [&](const Symbol &Sym) {
+    Matches.push_back(clang::clangd::getQualifiedName(Sym));
+  });
+  if (Incomplete)
+    *Incomplete = IsIncomplete;
+  return Matches;
+}
+
+// Returns qualified names of symbols with any of IDs in the index.
+std::vector<std::string> lookup(const SymbolIndex &I,
+                                llvm::ArrayRef<SymbolID> IDs) {
+  LookupRequest Req;
+  Req.IDs.insert(IDs.begin(), IDs.end());
+  std::vector<std::string> Results;
+  I.lookup(Req, [&](const Symbol &Sym) {
+    Results.push_back(getQualifiedName(Sym));
+  });
+  return Results;
+}
+
+} // namespace clangd
+} // namespace clang
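
As a quick illustration of what the symbol() helper above does with a
qualified name: it splits at the last "::" and keeps the separator on the
scope, which is why getQualifiedName() can simply concatenate Scope and Name.
The sketch below uses std::string instead of llvm::StringRef so that it builds
without LLVM; splitQualifiedName is a hypothetical name and is not part of the
patch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper mirroring the split in symbol(): returns {Scope, Name}.
std::pair<std::string, std::string>
splitQualifiedName(const std::string &QName) {
  std::pair<std::string, std::string> Result;
  std::size_t Pos = QName.rfind("::");
  if (Pos == std::string::npos) {
    // No scope separator: the whole string is the name.
    Result.second = QName;
  } else {
    // The scope keeps its trailing "::".
    Result.first = QName.substr(0, Pos + 2);
    Result.second = QName.substr(Pos + 2);
  }
  // Concatenating both halves must reproduce the input.
  assert(Result.first + Result.second == QName);
  return Result;
}
```

For "ns::abc" this gives the scope "ns::" and the name "abc"; an unqualified
"abc" gets an empty scope.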
Index: clang-tools-extra/trunk/unittests/clangd/DexIndexTests.cpp
===
--- clang-tools-extra/trunk/unittests/clangd/DexIndexTests.cpp
+++ clang-tools-extra/trunk/unittests/clangd/DexIndexTests.cpp
@@ -7,6 +7,10 @@
 //
 //===--===//
 
+#include "TestIndex.h"
+#include "index/Index.h"
+#include "index/Merge.h"
+#include "index/dex/DexIndex.h"
 #include "index/dex/Iterator.h"
 #include "index/dex/Token.h"
 #include "index/dex/Trigram.h"
@@ -17,11 +21,13 @@
 #include 
 #include 
 
+using ::testing::ElementsAre;
+using ::testing::UnorderedElementsAre;
+
 namespace clang {
 namespace clangd {
 namespace dex {
-
-using ::testing::ElementsAre;
+namespace {
 
 TEST(DexIndexIterators, DocumentIterator) {
   const PostingList L = {4, 7, 8, 20, 42, 100};
@@ -359,6 +365,175 @@
"hij", "ijk", "jkl", "klm"}));
 }
 
+TEST(DexIndex, Lookup) {
+  DexIndex I;
+  I.build(generateSymbols({"ns::abc", "ns::xyz"}));
+  EXPECT_THAT(lookup(I, SymbolID("ns::abc")), UnorderedElementsAre("ns::abc"));
+  EXPECT_THAT(lookup(I, {SymbolID("ns::abc"), SymbolID("ns::xyz")}),
+  UnorderedElementsAre("ns::abc", "ns::xyz"));
+  EXPECT_THAT(lookup(I, {SymbolID("ns::nonono"), SymbolID("ns::xyz")}),
+  UnorderedElementsAre("ns::xyz"));
+  EXPECT_THAT(lookup(I, SymbolID("ns::nonono")), UnorderedElements

[PATCH] D50897: [clangd] Allow using experimental Dex index

2018-08-20 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev updated this revision to Diff 161487.
kbobyrev marked an inline comment as done.

https://reviews.llvm.org/D50897

Files:
  clang-tools-extra/clangd/index/MemIndex.cpp
  clang-tools-extra/clangd/index/MemIndex.h
  clang-tools-extra/clangd/index/dex/DexIndex.cpp
  clang-tools-extra/clangd/index/dex/DexIndex.h
  clang-tools-extra/clangd/tool/ClangdMain.cpp

Index: clang-tools-extra/clangd/tool/ClangdMain.cpp
===
--- clang-tools-extra/clangd/tool/ClangdMain.cpp
+++ clang-tools-extra/clangd/tool/ClangdMain.cpp
@@ -12,6 +12,7 @@
 #include "Path.h"
 #include "Trace.h"
 #include "index/SymbolYAML.h"
+#include "index/dex/DexIndex.h"
 #include "clang/Basic/Version.h"
 #include "llvm/Support/CommandLine.h"
 #include "llvm/Support/FileSystem.h"
@@ -28,27 +29,6 @@
 using namespace clang;
 using namespace clang::clangd;
 
-namespace {
-enum class PCHStorageFlag { Disk, Memory };
-
-// Build an in-memory static index for global symbols from a YAML-format file.
-// The size of global symbols should be relatively small, so that all symbols
-// can be managed in memory.
-std::unique_ptr<SymbolIndex> buildStaticIndex(llvm::StringRef YamlSymbolFile) {
-  auto Buffer = llvm::MemoryBuffer::getFile(YamlSymbolFile);
-  if (!Buffer) {
-    llvm::errs() << "Can't open " << YamlSymbolFile << "\n";
-    return nullptr;
-  }
-  auto Slab = symbolsFromYAML(Buffer.get()->getBuffer());
-  SymbolSlab::Builder SymsBuilder;
-  for (auto Sym : Slab)
-    SymsBuilder.insert(Sym);
-
-  return MemIndex::build(std::move(SymsBuilder).build());
-}
-} // namespace
-
 static llvm::cl::opt CompileCommandsDir(
 "compile-commands-dir",
 llvm::cl::desc("Specify a path to look for compile_commands.json. If path "
@@ -113,6 +93,30 @@
 "Intended to simplify lit tests."),
 llvm::cl::init(false), llvm::cl::Hidden);
 
+namespace {
+
+enum class PCHStorageFlag { Disk, Memory };
+
+// Build an in-memory static index for global symbols from a YAML-format file.
+// The size of global symbols should be relatively small, so that all symbols
+// can be managed in memory.
+std::unique_ptr<SymbolIndex> buildStaticIndex(llvm::StringRef YamlSymbolFile) {
+  auto Buffer = llvm::MemoryBuffer::getFile(YamlSymbolFile);
+  if (!Buffer) {
+    llvm::errs() << "Can't open " << YamlSymbolFile << "\n";
+    return nullptr;
+  }
+  auto Slab = symbolsFromYAML(Buffer.get()->getBuffer());
+  SymbolSlab::Builder SymsBuilder;
+  for (auto Sym : Slab)
+    SymsBuilder.insert(Sym);
+
+  return UseDex ? DexIndex::build(std::move(SymsBuilder).build())
+                : MemIndex::build(std::move(SymsBuilder).build());
+}
+
+} // namespace
+
 static llvm::cl::opt PCHStorage(
 "pch-storage",
 llvm::cl::desc("Storing PCHs in memory increases memory usages, but may "
@@ -185,6 +189,11 @@
 "'compile_commands.json' files")),
 llvm::cl::init(FilesystemCompileArgs), llvm::cl::Hidden);
 
+static llvm::cl::opt<bool>
+    UseDex("use-dex-index",
+           llvm::cl::desc("Use experimental Dex static index."),
+           llvm::cl::init(false), llvm::cl::Hidden);
+
 int main(int argc, char *argv[]) {
   llvm::sys::PrintStackTraceOnErrorSignal(argv[0]);
   llvm::cl::SetVersionPrinter([](llvm::raw_ostream &OS) {
Index: clang-tools-extra/clangd/index/dex/DexIndex.h
===
--- clang-tools-extra/clangd/index/dex/DexIndex.h
+++ clang-tools-extra/clangd/index/dex/DexIndex.h
@@ -43,6 +43,9 @@
   /// accessible as long as `Symbols` is kept alive.
   void build(std::shared_ptr<std::vector<const Symbol *>> Symbols);
 
+  /// \brief Build index from a symbol slab.
+  static std::unique_ptr<SymbolIndex> build(SymbolSlab Slab);
+
   bool
   fuzzyFind(const FuzzyFindRequest &Req,
             llvm::function_ref<void(const Symbol &)> Callback) const override;
Index: clang-tools-extra/clangd/index/dex/DexIndex.cpp
===
--- clang-tools-extra/clangd/index/dex/DexIndex.cpp
+++ clang-tools-extra/clangd/index/dex/DexIndex.cpp
@@ -69,6 +69,12 @@
   }
 }
 
+std::unique_ptr<SymbolIndex> DexIndex::build(SymbolSlab Slab) {
+  auto Idx = llvm::make_unique<DexIndex>();
+  Idx->build(getSymbolsFromSlab(std::move(Slab)));
+  return std::move(Idx);
+}
+
 /// Constructs iterators over tokens extracted from the query and exhausts it
 /// while applying Callback to each symbol in the order of decreasing quality
 /// of the matched symbols.
Index: clang-tools-extra/clangd/index/MemIndex.h
===
--- clang-tools-extra/clangd/index/MemIndex.h
+++ clang-tools-extra/clangd/index/MemIndex.h
@@ -47,6 +47,10 @@
   mutable std::mutex Mutex;
 };
 
+// FIXME(kbobyrev): Document this one.
+std::shared_ptr<std::vector<const Symbol *>>
+getSymbolsFromSlab(SymbolSlab Slab);
+
 } // namespace clangd
 } // namespace clang
 
Index: clang-tools-extra/clangd/index/MemIndex.cpp
===
--- clang-tools-extra/clangd/index/

[PATCH] D51029: [clangd] Implement LIMIT iterator

2018-08-21 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev created this revision.
kbobyrev added reviewers: ioeric, ilya-biryukov.
kbobyrev added a project: clang-tools-extra.
Herald added subscribers: kadircet, arphaman, jkorous, MaskRay.

https://reviews.llvm.org/D51029

Files:
  clang-tools-extra/clangd/index/dex/DexIndex.cpp
  clang-tools-extra/clangd/index/dex/Iterator.cpp
  clang-tools-extra/clangd/index/dex/Iterator.h
  clang-tools-extra/unittests/clangd/DexIndexTests.cpp

Index: clang-tools-extra/unittests/clangd/DexIndexTests.cpp
===
--- clang-tools-extra/unittests/clangd/DexIndexTests.cpp
+++ clang-tools-extra/unittests/clangd/DexIndexTests.cpp
@@ -62,7 +62,7 @@
   auto AndWithEmpty = createAnd(create(L0), create(L1));
   EXPECT_TRUE(AndWithEmpty->reachedEnd());
 
-  EXPECT_THAT(consume(*AndWithEmpty), ElementsAre());
+  EXPECT_THAT(consume(move(AndWithEmpty)), ElementsAre());
 }
 
 TEST(DexIndexIterators, AndTwoLists) {
@@ -72,7 +72,7 @@
   auto And = createAnd(create(L1), create(L0));
 
   EXPECT_FALSE(And->reachedEnd());
-  EXPECT_THAT(consume(*And), ElementsAre(0U, 7U, 10U, 320U, 9000U));
+  EXPECT_THAT(consume(move(And)), ElementsAre(0U, 7U, 10U, 320U, 9000U));
 
   And = createAnd(create(L0), create(L1));
 
@@ -113,7 +113,7 @@
   auto OrWithEmpty = createOr(create(L0), create(L1));
   EXPECT_FALSE(OrWithEmpty->reachedEnd());
 
-  EXPECT_THAT(consume(*OrWithEmpty),
+  EXPECT_THAT(consume(move(OrWithEmpty)),
   ElementsAre(0U, 5U, 7U, 10U, 42U, 320U, 9000U));
 }
 
@@ -146,7 +146,7 @@
 
   Or = createOr(create(L0), create(L1));
 
-  EXPECT_THAT(consume(*Or),
+  EXPECT_THAT(consume(move(Or)),
   ElementsAre(0U, 4U, 5U, 7U, 10U, 30U, 42U, 60U, 320U, 9000U));
 }
 
@@ -256,29 +256,29 @@
   const PostingList L5;
 
   auto DocIterator = create(L0);
-  EXPECT_THAT(consume(*DocIterator, 42), ElementsAre(4, 7, 8, 20, 42, 100));
+  EXPECT_THAT(consume(move(DocIterator), 42), ElementsAre(4, 7, 8, 20, 42, 100));
 
   DocIterator = create(L0);
-  EXPECT_THAT(consume(*DocIterator), ElementsAre(4, 7, 8, 20, 42, 100));
+  EXPECT_THAT(consume(move(DocIterator)), ElementsAre(4, 7, 8, 20, 42, 100));
 
   DocIterator = create(L0);
-  EXPECT_THAT(consume(*DocIterator, 3), ElementsAre(4, 7, 8));
+  EXPECT_THAT(consume(move(DocIterator), 3), ElementsAre(4, 7, 8));
 
   DocIterator = create(L0);
-  EXPECT_THAT(consume(*DocIterator, 0), ElementsAre());
+  EXPECT_THAT(consume(move(DocIterator), 0), ElementsAre());
 }
 
 TEST(DexIndexIterators, True) {
   auto TrueIterator = createTrue(0U);
   EXPECT_TRUE(TrueIterator->reachedEnd());
-  EXPECT_THAT(consume(*TrueIterator), ElementsAre());
+  EXPECT_THAT(consume(move(TrueIterator)), ElementsAre());
 
   PostingList L0 = {1, 2, 5, 7};
   TrueIterator = createTrue(7U);
   EXPECT_THAT(TrueIterator->peek(), 0);
   auto AndIterator = createAnd(create(L0), move(TrueIterator));
   EXPECT_FALSE(AndIterator->reachedEnd());
-  EXPECT_THAT(consume(*AndIterator), ElementsAre(1, 2, 5));
+  EXPECT_THAT(consume(move(AndIterator)), ElementsAre(1, 2, 5));
 }
 
 testing::Matcher>
Index: clang-tools-extra/clangd/index/dex/Iterator.h
===
--- clang-tools-extra/clangd/index/dex/Iterator.h
+++ clang-tools-extra/clangd/index/dex/Iterator.h
@@ -89,6 +89,13 @@
   ///
   /// Note: reachedEnd() must be false.
   virtual DocID peek() const = 0;
+  /// Marks an element as "consumed" and triggers limit decrement for every
+  /// LIMIT iterator which yields given item. Root iterator should call
+  /// consume(peek) as the argument is used to propagate the tree, otherwise
+  /// the behavior will be incorrect.
+  virtual void consume(DocID ID) = 0;
+  /// Returns true if the iterator can retrieve more items upon request.
+  virtual bool canRetrieveMore() const { return !reachedEnd(); }
 
   virtual ~Iterator() {}
 
@@ -113,7 +120,7 @@
 
 /// Advances the iterator until it is either exhausted or the number of
 /// requested items is reached. The result contains sorted DocumentIDs.
-std::vector<DocID> consume(Iterator &It,
+std::vector<DocID> consume(std::unique_ptr<Iterator> It,
                            size_t Limit = std::numeric_limits<size_t>::max());
 
 /// Returns a document iterator over given PostingList.
@@ -133,6 +140,11 @@
 /// all items in range [0, Size) in an efficient manner.
 std::unique_ptr<Iterator> createTrue(DocID Size);
 
+/// Returns LIMIT iterator, which ...
+// FIXME(kbobyrev): Add docs.
+std::unique_ptr<Iterator>
+createLimit(std::unique_ptr<Iterator> Child, DocID Size);
+
 /// This allows createAnd(create(...), create(...)) syntax.
 template <typename... Args> std::unique_ptr<Iterator> createAnd(Args... args) {
   std::vector<std::unique_ptr<Iterator>> Children;
Index: clang-tools-extra/clangd/index/dex/Iterator.cpp
===
--- clang-tools-extra/clangd/index/dex/Iterator.cpp
+++ clang-tools-extra/clangd/index/dex/Iterator.cpp
@@ -46,6 +46,8 @@
 return *Index;
   }
 
+  void consume (DocID ID) override {}
+
 private:
   llvm::raw

[PATCH] D51029: [clangd] Implement LIMIT iterator

2018-08-21 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev planned changes to this revision.
kbobyrev added a comment.

The patch is in preview mode; I will add documentation and more tests before 
opening a review.
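
To make the new consume() signature concrete: it now takes the iterator by
std::unique_ptr, so every call in the tests passes move(It) and re-creates the
iterator before the next check. The sketch below shows the top-level loop
("advance until exhausted or until Limit items are retrieved") on a plain
posting list. ListIterator, consumeSketch and firstN are hypothetical names,
and this is a simplified stand-in rather than the code in Iterator.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <memory>
#include <utility>
#include <vector>

using DocID = std::uint32_t;

// Hypothetical stand-in for a posting-list iterator.
class ListIterator {
public:
  explicit ListIterator(std::vector<DocID> Items) : Docs(std::move(Items)) {}
  bool reachedEnd() const { return Pos == Docs.size(); }
  DocID peek() const {
    assert(!reachedEnd() && "peek() requires reachedEnd() == false");
    return Docs[Pos];
  }
  void advance() {
    assert(!reachedEnd());
    ++Pos;
  }

private:
  std::vector<DocID> Docs;
  std::size_t Pos = 0;
};

// Takes ownership of the iterator, so the caller's pointer is null afterwards.
std::vector<DocID>
consumeSketch(std::unique_ptr<ListIterator> It,
              std::size_t Limit = std::numeric_limits<std::size_t>::max()) {
  std::vector<DocID> Result;
  for (std::size_t Retrieved = 0; !It->reachedEnd() && Retrieved < Limit;
       It->advance(), ++Retrieved)
    Result.push_back(It->peek());
  return Result;
}

// Consumes at most N items from the list used in the original Limit test.
std::vector<DocID> firstN(std::size_t N) {
  auto It = std::make_unique<ListIterator>(
      std::vector<DocID>{4, 7, 8, 20, 42, 100});
  return consumeSketch(std::move(It), N);
}
```

With this list, a limit of 3 yields the first three IDs, a limit of 0 yields
nothing, and a limit of 42 yields all six, as in the test expectations.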


https://reviews.llvm.org/D51029





[PATCH] D51029: [clangd] Implement LIMIT iterator

2018-08-21 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev updated this revision to Diff 161677.
kbobyrev added a comment.

Add comprehensive tests, improve documentation.
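
A note on the new AND-of-LIMITs expectation, ElementsAre(3, 7): the three
lists intersect in {3, 7, 100}, but the LIMIT of 2 on L0 is used up after 3
and 7 have been consumed, so 100 is never reached. The sketch below models
that behaviour under simplifying assumptions: there is no virtual Iterator
interface, AND holds LIMIT-wrapped lists directly, the createTrue(9000) child
from the test is left out, and all type names are hypothetical. It illustrates
the consume(peek()) protocol described in Iterator.h and is not the
implementation from Iterator.cpp.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using DocID = std::uint32_t;

// Cursor over a sorted posting list.
struct Doc {
  std::vector<DocID> Docs;
  std::size_t Pos = 0;
  explicit Doc(std::vector<DocID> D) : Docs(std::move(D)) {}
  bool reachedEnd() const { return Pos == Docs.size(); }
  DocID peek() const { assert(!reachedEnd()); return Docs[Pos]; }
  void advance() { assert(!reachedEnd()); ++Pos; }
  void advanceTo(DocID ID) {
    while (!reachedEnd() && Docs[Pos] < ID)
      ++Pos;
  }
};

// Wraps a list and stops yielding after Left of its items were consumed.
struct Limit {
  Doc Child;
  std::size_t Left;
  Limit(std::vector<DocID> Docs, std::size_t N)
      : Child(std::move(Docs)), Left(N) {}
  bool reachedEnd() const { return Left == 0 || Child.reachedEnd(); }
  DocID peek() const { return Child.peek(); }
  void advance() { Child.advance(); }
  void advanceTo(DocID ID) { Child.advanceTo(ID); }
  // Spends one unit of budget if this iterator currently points at ID.
  void consume(DocID ID) {
    if (Left > 0 && !Child.reachedEnd() && Child.peek() == ID)
      --Left;
  }
};

// Intersection of LIMIT-wrapped lists.
struct And {
  std::vector<Limit> Children;
  bool End = false;
  explicit And(std::vector<Limit> C) : Children(std::move(C)) {
    assert(!Children.empty());
    sync();
  }
  bool reachedEnd() const { return End; }
  DocID peek() const { assert(!End); return Children.front().peek(); }
  void advance() { Children.front().advance(); sync(); }
  void consume(DocID ID) {
    for (Limit &C : Children)
      C.consume(ID);
  }
  // Moves every child to the next common DocID, or marks the end.
  void sync() {
    for (bool Same = false; !Same;) {
      DocID Max = 0;
      for (Limit &C : Children) {
        if (C.reachedEnd()) {
          End = true;
          return;
        }
        Max = std::max(Max, C.peek());
      }
      Same = true;
      for (Limit &C : Children) {
        C.advanceTo(Max);
        if (C.reachedEnd() || C.peek() != Max)
          Same = false;
      }
    }
  }
};

// Root-level loop: report peek() to the tree, record it, then move on.
std::vector<DocID> drain(And &Root) {
  std::vector<DocID> Out;
  while (!Root.reachedEnd()) {
    DocID ID = Root.peek();
    Root.consume(ID);
    Out.push_back(ID);
    Root.advance();
  }
  return Out;
}

// L0, L1 and L2 from the updated Limit test, with limits 2, 3 and 42.
std::vector<DocID> limitedAnd() {
  And Root({Limit({3, 6, 7, 20, 42, 100}, 2),
            Limit({1, 3, 5, 6, 7, 30, 100}, 3),
            Limit({0, 3, 5, 7, 8, 100}, 42)});
  return drain(Root);
}
```

In this model the budget is decremented only when the root actually consumes
an item, so a LIMIT is charged for results that reach the caller rather than
for items skipped during intersection.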


https://reviews.llvm.org/D51029

Files:
  clang-tools-extra/clangd/index/dex/DexIndex.cpp
  clang-tools-extra/clangd/index/dex/Iterator.cpp
  clang-tools-extra/clangd/index/dex/Iterator.h
  clang-tools-extra/unittests/clangd/DexIndexTests.cpp

Index: clang-tools-extra/unittests/clangd/DexIndexTests.cpp
===
--- clang-tools-extra/unittests/clangd/DexIndexTests.cpp
+++ clang-tools-extra/unittests/clangd/DexIndexTests.cpp
@@ -62,7 +62,7 @@
   auto AndWithEmpty = createAnd(create(L0), create(L1));
   EXPECT_TRUE(AndWithEmpty->reachedEnd());
 
-  EXPECT_THAT(consume(*AndWithEmpty), ElementsAre());
+  EXPECT_THAT(consume(move(AndWithEmpty)), ElementsAre());
 }
 
 TEST(DexIndexIterators, AndTwoLists) {
@@ -72,7 +72,7 @@
   auto And = createAnd(create(L1), create(L0));
 
   EXPECT_FALSE(And->reachedEnd());
-  EXPECT_THAT(consume(*And), ElementsAre(0U, 7U, 10U, 320U, 9000U));
+  EXPECT_THAT(consume(move(And)), ElementsAre(0U, 7U, 10U, 320U, 9000U));
 
   And = createAnd(create(L0), create(L1));
 
@@ -113,7 +113,7 @@
   auto OrWithEmpty = createOr(create(L0), create(L1));
   EXPECT_FALSE(OrWithEmpty->reachedEnd());
 
-  EXPECT_THAT(consume(*OrWithEmpty),
+  EXPECT_THAT(consume(move(OrWithEmpty)),
   ElementsAre(0U, 5U, 7U, 10U, 42U, 320U, 9000U));
 }
 
@@ -146,7 +146,7 @@
 
   Or = createOr(create(L0), create(L1));
 
-  EXPECT_THAT(consume(*Or),
+  EXPECT_THAT(consume(move(Or)),
   ElementsAre(0U, 4U, 5U, 7U, 10U, 30U, 42U, 60U, 320U, 9000U));
 }
 
@@ -248,37 +248,40 @@
 }
 
 TEST(DexIndexIterators, Limit) {
-  const PostingList L0 = {4, 7, 8, 20, 42, 100};
-  const PostingList L1 = {1, 3, 5, 8, 9};
-  const PostingList L2 = {1, 5, 7, 9};
-  const PostingList L3 = {0, 5};
-  const PostingList L4 = {0, 1, 5};
-  const PostingList L5;
+  const PostingList L0 = {3, 6, 7, 20, 42, 100};
+  const PostingList L1 = {1, 3, 5, 6, 7, 30, 100};
+  const PostingList L2 = {0, 3, 5, 7, 8, 100};
 
   auto DocIterator = create(L0);
-  EXPECT_THAT(consume(*DocIterator, 42), ElementsAre(4, 7, 8, 20, 42, 100));
+  EXPECT_THAT(consume(move(DocIterator), 42),
+  ElementsAre(3, 6, 7, 20, 42, 100));
 
   DocIterator = create(L0);
-  EXPECT_THAT(consume(*DocIterator), ElementsAre(4, 7, 8, 20, 42, 100));
+  EXPECT_THAT(consume(move(DocIterator)), ElementsAre(3, 6, 7, 20, 42, 100));
 
   DocIterator = create(L0);
-  EXPECT_THAT(consume(*DocIterator, 3), ElementsAre(4, 7, 8));
+  EXPECT_THAT(consume(move(DocIterator), 3), ElementsAre(3, 6, 7));
 
   DocIterator = create(L0);
-  EXPECT_THAT(consume(*DocIterator, 0), ElementsAre());
+  EXPECT_THAT(consume(move(DocIterator), 0), ElementsAre());
+
+  auto AndIterator =
+  createAnd(createLimit(createTrue(9000), 343), createLimit(create(L0), 2),
+createLimit(create(L1), 3), createLimit(create(L2), 42));
+  EXPECT_THAT(consume(move(AndIterator)), ElementsAre(3, 7));
 }
 
 TEST(DexIndexIterators, True) {
   auto TrueIterator = createTrue(0U);
   EXPECT_TRUE(TrueIterator->reachedEnd());
-  EXPECT_THAT(consume(*TrueIterator), ElementsAre());
+  EXPECT_THAT(consume(move(TrueIterator)), ElementsAre());
 
   PostingList L0 = {1, 2, 5, 7};
   TrueIterator = createTrue(7U);
   EXPECT_THAT(TrueIterator->peek(), 0);
   auto AndIterator = createAnd(create(L0), move(TrueIterator));
   EXPECT_FALSE(AndIterator->reachedEnd());
-  EXPECT_THAT(consume(*AndIterator), ElementsAre(1, 2, 5));
+  EXPECT_THAT(consume(move(AndIterator)), ElementsAre(1, 2, 5));
 }
 
 testing::Matcher>
Index: clang-tools-extra/clangd/index/dex/Iterator.h
===
--- clang-tools-extra/clangd/index/dex/Iterator.h
+++ clang-tools-extra/clangd/index/dex/Iterator.h
@@ -89,6 +89,13 @@
   ///
   /// Note: reachedEnd() must be false.
   virtual DocID peek() const = 0;
+  /// Marks an element as "consumed" and triggers limit decrement for every
+  /// LIMIT iterator which yields given item. Root iterator should call
+  /// consume(peek) as the argument is used to propagate the tree, otherwise
+  /// the behavior will be incorrect.
+  virtual void consume(DocID ID) = 0;
+  /// Returns true if the iterator can retrieve more items upon request.
+  virtual bool canRetrieveMore() const { return !reachedEnd(); }
 
   virtual ~Iterator() {}
 
@@ -113,7 +120,7 @@
 
 /// Advances the iterator until it is either exhausted or the number of
 /// requested items is reached. The result contains sorted DocumentIDs.
-std::vector<DocID> consume(Iterator &It,
+std::vector<DocID> consume(std::unique_ptr<Iterator> It,
                            size_t Limit = std::numeric_limits<size_t>::max());
 
 /// Returns a document iterator over given PostingList.
@@ -133,6 +140,11 @@
 /// all items in range [0, Size) in an efficient manner.
 std::unique_ptr<Iterator> createTrue(DocID Size);
 
+/// Returns LIMIT iterator, 

[PATCH] D51029: [clangd] Implement LIMIT iterator

2018-08-21 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev planned changes to this revision.
kbobyrev added a comment.

Since it's bundled with the BOOST iterator and doesn't make much sense 
without it, I should probably rebase on top of https://reviews.llvm.org/D50970 
and add it as the parent revision.


https://reviews.llvm.org/D51029





[PATCH] D50897: [clangd] Allow using experimental Dex index

2018-08-21 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev updated this revision to Diff 161679.
kbobyrev marked 2 inline comments as done.
kbobyrev added a comment.

Aww, the previous diff was the wrong one and didn't contain docs.

The move of the code to the middle of Clangd driver was justified by the 
assumption that it might be better to move the code that uses `UseDex` flag 
closer to the flag itself, but I agree: it looks strange. I moved it back.
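
For readers following the diff below: the `getSymbolsFromSlab` helper that both 
`build()` overloads share appears to rely on the aliasing constructor of 
`std::shared_ptr`, so that the returned pointer refers to the vector of symbol 
pointers while owning the whole snapshot (slab included). A minimal standalone 
sketch of that technique follows; `Item`, `ItemSlab` and `pointersFromSlab` are 
hypothetical stand-ins for illustration, not clangd types.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for clangd's Symbol and SymbolSlab.
struct Item {
  std::string Name;
};
using ItemSlab = std::vector<Item>;

// Returns pointers to the items of Slab. The returned shared_ptr keeps the
// slab alive, so the pointers stay valid for as long as the result is held.
std::shared_ptr<std::vector<const Item *>> pointersFromSlab(ItemSlab Slab) {
  struct Snapshot {
    ItemSlab Slab;
    std::vector<const Item *> Pointers;
  };
  auto Snap = std::make_shared<Snapshot>();
  Snap->Slab = std::move(Slab);
  for (const Item &I : Snap->Slab)
    Snap->Pointers.push_back(&I);
  // Take the address before handing Snap over to the aliasing constructor.
  std::vector<const Item *> *Ptrs = &Snap->Pointers;
  // Aliasing constructor: shares ownership of the Snapshot, points at Ptrs.
  return std::shared_ptr<std::vector<const Item *>>(std::move(Snap), Ptrs);
}
```

The caller only ever sees the vector of pointers; the slab is destroyed 
together with the last copy of the returned `shared_ptr`.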


https://reviews.llvm.org/D50897

Files:
  clang-tools-extra/clangd/index/MemIndex.cpp
  clang-tools-extra/clangd/index/MemIndex.h
  clang-tools-extra/clangd/index/dex/DexIndex.cpp
  clang-tools-extra/clangd/index/dex/DexIndex.h
  clang-tools-extra/clangd/tool/ClangdMain.cpp

Index: clang-tools-extra/clangd/tool/ClangdMain.cpp
===
--- clang-tools-extra/clangd/tool/ClangdMain.cpp
+++ clang-tools-extra/clangd/tool/ClangdMain.cpp
@@ -12,6 +12,7 @@
 #include "Path.h"
 #include "Trace.h"
 #include "index/SymbolYAML.h"
+#include "index/dex/DexIndex.h"
 #include "clang/Basic/Version.h"
 #include "llvm/Support/CommandLine.h"
 #include "llvm/Support/FileSystem.h"
@@ -29,6 +30,7 @@
 using namespace clang::clangd;
 
 namespace {
+
 enum class PCHStorageFlag { Disk, Memory };
 
 // Build an in-memory static index for global symbols from a YAML-format file.
@@ -45,8 +47,10 @@
   for (auto Sym : Slab)
 SymsBuilder.insert(Sym);
 
-  return MemIndex::build(std::move(SymsBuilder).build());
+  return UseDex ? DexIndex::build(std::move(SymsBuilder).build())
+: MemIndex::build(std::move(SymsBuilder).build());
 }
+
 } // namespace
 
 static llvm::cl::opt CompileCommandsDir(
@@ -185,6 +189,11 @@
 "'compile_commands.json' files")),
 llvm::cl::init(FilesystemCompileArgs), llvm::cl::Hidden);
 
+static llvm::cl::opt<bool>
+    UseDex("use-dex-index",
+           llvm::cl::desc("Use experimental Dex static index."),
+           llvm::cl::init(false), llvm::cl::Hidden);
+
 int main(int argc, char *argv[]) {
   llvm::sys::PrintStackTraceOnErrorSignal(argv[0]);
   llvm::cl::SetVersionPrinter([](llvm::raw_ostream &OS) {
Index: clang-tools-extra/clangd/index/dex/DexIndex.h
===
--- clang-tools-extra/clangd/index/dex/DexIndex.h
+++ clang-tools-extra/clangd/index/dex/DexIndex.h
@@ -43,6 +43,9 @@
   /// accessible as long as `Symbols` is kept alive.
   void build(std::shared_ptr<std::vector<const Symbol *>> Symbols);
 
+  /// \brief Build index from a symbol slab.
+  static std::unique_ptr<SymbolIndex> build(SymbolSlab Slab);
+
   bool
   fuzzyFind(const FuzzyFindRequest &Req,
 llvm::function_ref Callback) const override;
Index: clang-tools-extra/clangd/index/dex/DexIndex.cpp
===
--- clang-tools-extra/clangd/index/dex/DexIndex.cpp
+++ clang-tools-extra/clangd/index/dex/DexIndex.cpp
@@ -69,6 +69,12 @@
   }
 }
 
+std::unique_ptr<SymbolIndex> DexIndex::build(SymbolSlab Slab) {
+  auto Idx = llvm::make_unique<DexIndex>();
+  Idx->build(getSymbolsFromSlab(std::move(Slab)));
+  return std::move(Idx);
+}
+
 /// Constructs iterators over tokens extracted from the query and exhausts it
 /// while applying Callback to each symbol in the order of decreasing quality
 /// of the matched symbols.
Index: clang-tools-extra/clangd/index/MemIndex.h
===
--- clang-tools-extra/clangd/index/MemIndex.h
+++ clang-tools-extra/clangd/index/MemIndex.h
@@ -47,6 +47,11 @@
   mutable std::mutex Mutex;
 };
 
+// Returns pointers to the symbols in given slab and bundles slab lifetime with
+// returned symbol pointers so that the pointers are never invalid.
+std::shared_ptr<std::vector<const Symbol *>>
+getSymbolsFromSlab(SymbolSlab Slab);
+
 } // namespace clangd
 } // namespace clang
 
Index: clang-tools-extra/clangd/index/MemIndex.cpp
===
--- clang-tools-extra/clangd/index/MemIndex.cpp
+++ clang-tools-extra/clangd/index/MemIndex.cpp
@@ -28,6 +28,12 @@
   }
 }
 
+std::unique_ptr<SymbolIndex> MemIndex::build(SymbolSlab Slab) {
+  auto Idx = llvm::make_unique<MemIndex>();
+  Idx->build(getSymbolsFromSlab(std::move(Slab)));
+  return std::move(Idx);
+}
+
 bool MemIndex::fuzzyFind(
 const FuzzyFindRequest &Req,
 llvm::function_ref Callback) const {
@@ -72,26 +78,24 @@
   }
 }
 
-std::unique_ptr MemIndex::build(SymbolSlab Slab) {
+void MemIndex::findOccurrences(
+const OccurrencesRequest &Req,
+llvm::function_ref Callback) const {
+  log("findOccurrences is not implemented.");
+}
+
+std::shared_ptr<std::vector<const Symbol *>>
+getSymbolsFromSlab(SymbolSlab Slab) {
   struct Snapshot {
     SymbolSlab Slab;
     std::vector<const Symbol *> Pointers;
   };
   auto Snap = std::make_shared<Snapshot>();
   Snap->Slab = std::move(Slab);
   for (auto &Sym : Snap->Slab)
     Snap->Pointers.push_back(&Sym);
-  auto S = std::shared_ptr<std::vector<const Symbol *>>(std::move(Snap),
-                                                        &Snap->Pointers);
-  auto MemIdx = llvm::make_uni

[PATCH] D50897: [clangd] Allow using experimental Dex index

2018-08-21 Thread Kirill Bobyrev via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL340262: [clangd] Allow using experimental Dex index 
(authored by omtcyfz, committed by ).
Herald added a subscriber: llvm-commits.

Changed prior to commit:
  https://reviews.llvm.org/D50897?vs=161679&id=161682#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D50897

Files:
  clang-tools-extra/trunk/clangd/index/MemIndex.cpp
  clang-tools-extra/trunk/clangd/index/MemIndex.h
  clang-tools-extra/trunk/clangd/index/dex/DexIndex.cpp
  clang-tools-extra/trunk/clangd/index/dex/DexIndex.h
  clang-tools-extra/trunk/clangd/tool/ClangdMain.cpp

Index: clang-tools-extra/trunk/clangd/tool/ClangdMain.cpp
===
--- clang-tools-extra/trunk/clangd/tool/ClangdMain.cpp
+++ clang-tools-extra/trunk/clangd/tool/ClangdMain.cpp
@@ -12,6 +12,7 @@
 #include "Path.h"
 #include "Trace.h"
 #include "index/SymbolYAML.h"
+#include "index/dex/DexIndex.h"
 #include "clang/Basic/Version.h"
 #include "llvm/Support/CommandLine.h"
 #include "llvm/Support/FileSystem.h"
@@ -29,6 +30,7 @@
 using namespace clang::clangd;
 
 namespace {
+
 enum class PCHStorageFlag { Disk, Memory };
 
 // Build an in-memory static index for global symbols from a YAML-format file.
@@ -45,8 +47,10 @@
   for (auto Sym : Slab)
 SymsBuilder.insert(Sym);
 
-  return MemIndex::build(std::move(SymsBuilder).build());
+  return UseDex ? DexIndex::build(std::move(SymsBuilder).build())
+: MemIndex::build(std::move(SymsBuilder).build());
 }
+
 } // namespace
 
 static llvm::cl::opt CompileCommandsDir(
@@ -185,6 +189,11 @@
 "'compile_commands.json' files")),
 llvm::cl::init(FilesystemCompileArgs), llvm::cl::Hidden);
 
+static llvm::cl::opt<bool>
+    UseDex("use-dex-index",
+           llvm::cl::desc("Use experimental Dex static index."),
+           llvm::cl::init(false), llvm::cl::Hidden);
+
 int main(int argc, char *argv[]) {
   llvm::sys::PrintStackTraceOnErrorSignal(argv[0]);
   llvm::cl::SetVersionPrinter([](llvm::raw_ostream &OS) {
Index: clang-tools-extra/trunk/clangd/index/MemIndex.h
===
--- clang-tools-extra/trunk/clangd/index/MemIndex.h
+++ clang-tools-extra/trunk/clangd/index/MemIndex.h
@@ -47,6 +47,11 @@
   mutable std::mutex Mutex;
 };
 
+// Returns pointers to the symbols in given slab and bundles slab lifetime with
+// returned symbol pointers so that the pointers are never invalid.
+std::shared_ptr<std::vector<const Symbol *>>
+getSymbolsFromSlab(SymbolSlab Slab);
+
 } // namespace clangd
 } // namespace clang
 
Index: clang-tools-extra/trunk/clangd/index/MemIndex.cpp
===
--- clang-tools-extra/trunk/clangd/index/MemIndex.cpp
+++ clang-tools-extra/trunk/clangd/index/MemIndex.cpp
@@ -28,6 +28,12 @@
   }
 }
 
+std::unique_ptr<SymbolIndex> MemIndex::build(SymbolSlab Slab) {
+  auto Idx = llvm::make_unique<MemIndex>();
+  Idx->build(getSymbolsFromSlab(std::move(Slab)));
+  return std::move(Idx);
+}
+
 bool MemIndex::fuzzyFind(
 const FuzzyFindRequest &Req,
 llvm::function_ref Callback) const {
@@ -72,26 +78,24 @@
   }
 }
 
-std::unique_ptr MemIndex::build(SymbolSlab Slab) {
+void MemIndex::findOccurrences(
+const OccurrencesRequest &Req,
+llvm::function_ref Callback) const {
+  log("findOccurrences is not implemented.");
+}
+
+std::shared_ptr<std::vector<const Symbol *>>
+getSymbolsFromSlab(SymbolSlab Slab) {
   struct Snapshot {
     SymbolSlab Slab;
     std::vector<const Symbol *> Pointers;
   };
   auto Snap = std::make_shared<Snapshot>();
   Snap->Slab = std::move(Slab);
   for (auto &Sym : Snap->Slab)
     Snap->Pointers.push_back(&Sym);
-  auto S = std::shared_ptr<std::vector<const Symbol *>>(std::move(Snap),
-                                                        &Snap->Pointers);
-  auto MemIdx = llvm::make_unique<MemIndex>();
-  MemIdx->build(std::move(S));
-  return std::move(MemIdx);
-}
-
-void MemIndex::findOccurrences(
-const OccurrencesRequest &Req,
-llvm::function_ref Callback) const {
-  log("findOccurrences is not implemented.");
+  return std::shared_ptr<std::vector<const Symbol *>>(std::move(Snap),
+                                                      &Snap->Pointers);
 }
 
 } // namespace clangd
Index: clang-tools-extra/trunk/clangd/index/dex/DexIndex.h
===
--- clang-tools-extra/trunk/clangd/index/dex/DexIndex.h
+++ clang-tools-extra/trunk/clangd/index/dex/DexIndex.h
@@ -43,6 +43,9 @@
   /// accessible as long as `Symbols` is kept alive.
   void build(std::shared_ptr<std::vector<const Symbol *>> Symbols);
 
+  /// \brief Build index from a symbol slab.
+  static std::unique_ptr<SymbolIndex> build(SymbolSlab Slab);
+
   bool
   fuzzyFind(const FuzzyFindRequest &Req,
 llvm::function_ref Callback) const override;
Index: clang-tools-extra/trunk/clangd/index/dex/DexIndex.cpp
===
--- clang-tools-extra/trunk/clangd/index/dex/DexIndex.cpp
+++ clang-tools-extra/trunk

[PATCH] D51090: [clangd] Add index benchmarks

2018-08-22 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev created this revision.
kbobyrev added reviewers: ioeric, ilya-biryukov, sammccall.
kbobyrev added a project: clang-tools-extra.
Herald added subscribers: kadircet, arphaman, jkorous, MaskRay, mgorny.

This patch introduces index benchmarks on top of the proposed LLVM benchmark 
pull.

The preliminary results are below (the benchmark was invoked with a symbol 
database built from LLVM + Clang + Clang-Tools-Extra + libc++):

2018-08-22 11:12:34
Run on (72 X 3700 MHz CPU s)
CPU Caches:

  L1 Data 32K (x36)
  L1 Instruction 32K (x36)
  L2 Unified 1024K (x36)
  L3 Unified 25344K (x2)

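------------------------------------------------------
Benchmark                Time           CPU Iterations
------------------------------------------------------
BuildMem        6042296009 ns 6041974540 ns          1
MemAdHocQueries  117140353 ns  117134140 ns          6
BuildDex        8902015670 ns 8901575912 ns          1
DexAdHocQueries    5137288 ns    5137076 ns        132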

Dex is about 22 times faster compared to MemIndex despite actually processing 
100x more items since `ItemsToRetrieve = 100 * Req.MaxCandidateCount`. When 
setting `ItemsToRetrieve = Req.MaxCandidateCount`, it is up to 1000x faster.
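
As a sanity check of the ratio quoted above, the speedup can be recomputed 
from the Time column of the table. The helper `speedup` and the constant names 
are made up for this sketch; only the four numbers come from the results.

```cpp
#include <cassert>

// Ratio of two timings; a value above 1 means the candidate is faster.
double speedup(double BaselineNs, double CandidateNs) {
  assert(CandidateNs > 0 && "timing must be positive");
  return BaselineNs / CandidateNs;
}

// Wall-clock times in nanoseconds, copied from the benchmark table.
constexpr double BuildMemNs = 6042296009.0;
constexpr double MemAdHocQueriesNs = 117140353.0;
constexpr double BuildDexNs = 8902015670.0;
constexpr double DexAdHocQueriesNs = 5137288.0;
```

`speedup(MemAdHocQueriesNs, DexAdHocQueriesNs)` comes out at roughly 22.8 for 
the query benchmark, while `speedup(BuildMemNs, BuildDexNs)` is roughly 0.68, 
i.e. in this single run building the Dex index took about 1.5 times as long as 
building MemIndex.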


https://reviews.llvm.org/D51090

Files:
  clang-tools-extra/clangd/CMakeLists.txt
  clang-tools-extra/clangd/benchmarks/CMakeLists.txt
  clang-tools-extra/clangd/benchmarks/IndexBenchmark.cpp
  clang-tools-extra/clangd/index/SymbolYAML.cpp
  clang-tools-extra/clangd/index/SymbolYAML.h
  clang-tools-extra/clangd/index/dex/DexIndex.cpp
  clang-tools-extra/clangd/tool/ClangdMain.cpp

Index: clang-tools-extra/clangd/tool/ClangdMain.cpp
===
--- clang-tools-extra/clangd/tool/ClangdMain.cpp
+++ clang-tools-extra/clangd/tool/ClangdMain.cpp
@@ -38,24 +38,6 @@
 
 enum class PCHStorageFlag { Disk, Memory };
 
-// Build an in-memory static index for global symbols from a YAML-format file.
-// The size of global symbols should be relatively small, so that all symbols
-// can be managed in memory.
-std::unique_ptr buildStaticIndex(llvm::StringRef YamlSymbolFile) {
-  auto Buffer = llvm::MemoryBuffer::getFile(YamlSymbolFile);
-  if (!Buffer) {
-llvm::errs() << "Can't open " << YamlSymbolFile << "\n";
-return nullptr;
-  }
-  auto Slab = symbolsFromYAML(Buffer.get()->getBuffer());
-  SymbolSlab::Builder SymsBuilder;
-  for (auto Sym : Slab)
-SymsBuilder.insert(Sym);
-
-  return UseDex ? dex::DexIndex::build(std::move(SymsBuilder).build())
-: MemIndex::build(std::move(SymsBuilder).build());
-}
-
 } // namespace
 
 static llvm::cl::opt CompileCommandsDir(
@@ -294,7 +276,7 @@
   Opts.BuildDynamicSymbolIndex = EnableIndex;
   std::unique_ptr StaticIdx;
   if (EnableIndex && !YamlSymbolFile.empty()) {
-StaticIdx = buildStaticIndex(YamlSymbolFile);
+StaticIdx = buildStaticIndex(YamlSymbolFile, UseDex);
 Opts.StaticIndex = StaticIdx.get();
   }
   Opts.AsyncThreadsCount = WorkerThreadsCount;
Index: clang-tools-extra/clangd/index/dex/DexIndex.cpp
===
--- clang-tools-extra/clangd/index/dex/DexIndex.cpp
+++ clang-tools-extra/clangd/index/dex/DexIndex.cpp
@@ -161,7 +161,6 @@
   }
 }
 
-
 void DexIndex::findOccurrences(
 const OccurrencesRequest &Req,
 llvm::function_ref Callback) const {
Index: clang-tools-extra/clangd/index/SymbolYAML.h
===
--- clang-tools-extra/clangd/index/SymbolYAML.h
+++ clang-tools-extra/clangd/index/SymbolYAML.h
@@ -42,6 +42,12 @@
 // The YAML result is safe to concatenate if you have multiple symbol slabs.
 void SymbolsToYAML(const SymbolSlab &Symbols, llvm::raw_ostream &OS);
 
+// Build an in-memory static index for global symbols from a YAML-format file.
+// The size of global symbols should be relatively small, so that all symbols
+// can be managed in memory.
+std::unique_ptr buildStaticIndex(llvm::StringRef YamlSymbolFile,
+  bool UseDex);
+
 } // namespace clangd
 } // namespace clang
 
Index: clang-tools-extra/clangd/index/SymbolYAML.cpp
===
--- clang-tools-extra/clangd/index/SymbolYAML.cpp
+++ clang-tools-extra/clangd/index/SymbolYAML.cpp
@@ -9,6 +9,7 @@
 
 #include "SymbolYAML.h"
 #include "Index.h"
+#include "dex/DexIndex.h"
 #include "llvm/ADT/Optional.h"
 #include "llvm/Support/Errc.h"
 #include "llvm/Support/MemoryBuffer.h"
@@ -203,5 +204,21 @@
   return OS.str();
 }
 
+std::unique_ptr buildStaticIndex(llvm::StringRef YamlSymbolFile,
+  bool UseDex) {
+  auto Buffer = llvm::MemoryBuffer::getFile(YamlSymbolFile);
+  if (!Buffer) {
+llvm::errs() << "Can't open " << YamlSymbolFile << "\n";
+return nullptr;
+  }
+  auto Slab = symbolsFromYAML(Buffer.get()->getBuffer());
+  SymbolSlab::Builder SymsBuilder;
+  for (auto Sym : S

[PATCH] D51090: [clangd] Add index benchmarks

2018-08-22 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev planned changes to this revision.
kbobyrev added a comment.

The current diff is rather messy and it is also blocked by the parent revision 
(https://reviews.llvm.org/D50894). It is likely to change if the parent CMake 
structure is changed.


https://reviews.llvm.org/D51090





[PATCH] D50970: [clangd] Implement BOOST iterator

2018-08-22 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev updated this revision to Diff 161909.
kbobyrev marked 4 inline comments as done.
kbobyrev added a subscriber: sammccall.
kbobyrev added a comment.

- Add more comments explaining the difference between `consume()` and 
`consumeAndBoost()` and their potential use cases for the clients
- Move `DEFAULT_BOOSTING_SCORE` to the header by making it `constexpr` and 
change its type to `float`
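
One way to read the expectations in the updated `QueryTree` test in this diff 
(boost 6 for document 1 and 8 for document 5) is that an AND node multiplies 
the boosts of its children, an OR node takes the maximum over its children, 
and anything not wrapped in BOOST contributes a default score of 1. The sketch 
below only reproduces that arithmetic under these assumptions; the helper 
names are hypothetical and this is not the code from the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

using DocID = unsigned;
using PostingList = std::vector<DocID>; // sorted document IDs

bool contains(const PostingList &L, DocID ID) {
  return std::binary_search(L.begin(), L.end(), ID);
}

// Boost contributed by a BOOST-wrapped posting list: Factor if the list
// contains ID, otherwise the assumed default score of 1.
float boostedLeaf(const PostingList &L, DocID ID, float Factor) {
  return contains(L, ID) ? Factor : 1.0f;
}

// Boost at the root of the QueryTree test for an ID that is in the result.
float queryTreeBoost(DocID ID) {
  const PostingList L1 = {1, 5, 7, 9}; // wrapped in Boost: 2
  const PostingList L4 = {1, 5};       // wrapped in Boost: 3
  const PostingList L5 = {0, 3, 5};    // wrapped in Boost: 4
  // Lower AND: the unboosted list L0 contributes 1, L1 contributes 2.
  float And = 1.0f * boostedLeaf(L1, ID, 2.0f);
  // Lower OR: maximum over the empty list (1) and the two boosted lists.
  float Or = std::max({1.0f, boostedLeaf(L4, ID, 3.0f),
                       boostedLeaf(L5, ID, 4.0f)});
  // Root AND multiplies the boosts of its two children.
  return And * Or;
}
```

Under these assumptions document 1 gets 2 * 3 = 6 (only the `Boost: 3` list 
contains it) and document 5 gets 2 * 4 = 8 (both boosted lists contain it and 
the larger factor wins), which matches the values the test checks.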


https://reviews.llvm.org/D50970

Files:
  clang-tools-extra/clangd/index/dex/Iterator.cpp
  clang-tools-extra/clangd/index/dex/Iterator.h
  clang-tools-extra/unittests/clangd/DexIndexTests.cpp

Index: clang-tools-extra/unittests/clangd/DexIndexTests.cpp
===
--- clang-tools-extra/unittests/clangd/DexIndexTests.cpp
+++ clang-tools-extra/unittests/clangd/DexIndexTests.cpp
@@ -178,53 +178,61 @@
 // FIXME(kbobyrev): The testcase below is similar to what is expected in real
 // queries. It should be updated once new iterators (such as boosting, limiting,
 // etc iterators) appear. However, it is not exhaustive and it would be
-// beneficial to implement automatic generation of query trees for more
-// comprehensive testing.
+// beneficial to implement automatic generation (e.g. fuzzing) of query trees
+// for more comprehensive testing.
 TEST(DexIndexIterators, QueryTree) {
-  // An example of more complicated query
   //
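   //                      +-----------------+
   //                      |And Iterator:1, 5|
   //                      +--------+--------+
   //                               |
   //                               |
-  //                 +------------------------------------+
+  //                 +-------------+----------------------+
   //                 |                                    |
   //                 |                                    |
-  //      +----------v----------+              +----------v---------+
-  //      |And Iterator: 1, 5, 9|              |Or Iterator: 0, 1, 5|
-  //      +----------+----------+              +----------+---------+
+  //      +----------v----------+              +----------v------------+
+  //      |And Iterator: 1, 5, 9|              |Or Iterator: 0, 1, 3, 5|
+  //      +----------+----------+              +----------+------------+
   //                 |                                    |
-  //          +------+-----+                    +---------+-----------+
+  //          +------+-----+                    +---------------------+
   //          |            |                    |         |           |
-  //  +-------v-----+ +----v-----+           +--v--+    +-V--+    +---v---+
-  //  |1, 3, 5, 8, 9| |1, 5, 7, 9|           |Empty|    |0, 5|    |0, 1, 5|
-  //  +-------------+ +----------+           +-----+    +----+    +-------+
-
+  //  +-------v-----+ +----+---+             +--v--+  +---v----+ +----v---+
+  //  |1, 3, 5, 8, 9| |Boost: 2|             |Empty|  |Boost: 3| |Boost: 4|
+  //  +-------------+ +----+---+             +-----+  +---+----+ +----+---+
+  //                       |                              |           |
+  //                  +----v-----+                      +-v--+    +---v---+
+  //                  |1, 5, 7, 9|                      |1, 5|    |0, 3, 5|
+  //                  +----------+                      +----+    +-------+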
+  //
   const PostingList L0 = {1, 3, 5, 8, 9};
   const PostingList L1 = {1, 5, 7, 9};
-  const PostingList L2 = {0, 5};
-  const PostingList L3 = {0, 1, 5};
-  const PostingList L4;
+  const PostingList L3;
+  const PostingList L4 = {1, 5};
+  const PostingList L5 = {0, 3, 5};
 
   // Root of the query tree: [1, 5]
   auto Root = createAnd(
   // Lower And Iterator: [1, 5, 9]
-  createAnd(create(L0), create(L1)),
+  createAnd(create(L0), createBoost(create(L1), 2U)),
   // Lower Or Iterator: [0, 1, 5]
-  createOr(create(L2), create(L3), create(L4)));
+  createOr(create(L3), createBoost(create(L4), 3U),
+   createBoost(create(L5), 4U)));
 
   EXPECT_FALSE(Root->reachedEnd());
   EXPECT_EQ(Root->peek(), 1U);
   Root->advanceTo(0);
   // Advance multiple times. Shouldn't do anything.
   Root->advanceTo(1);
   Root->advanceTo(0);
   EXPECT_EQ(Root->peek(), 1U);
+  auto ElementBoost = Root->boost(Root->peek());
+  EXPECT_THAT(ElementBoost, 6);
   Root->advance();
   EXPECT_EQ(Root->peek(), 5U);
   Root->advanceTo(5);
   EXPECT_EQ(Root->peek(), 5U);
+  ElementBoost = Root->boost(Root->peek());
+  EXPECT_THAT(ElementBoost, 8);
   Root->advanceTo(9000);
   EXPECT_TRUE(Root->reachedEnd());
 }
@@ -281,6 +289,34 @@
   EXPECT_THAT(consume(*AndIterator), ElementsAre(1, 2, 5));
 }
 
+TEST(DexIndexIterators, Boost) {
+  auto BoostIterator = createBoost(createTrue(5U), 42U);
+  EXPECT_FALSE(BoostIterator->reachedEnd());
+  auto ElementBoost = BoostIterator->boost(BoostIterator->peek());
+  EXPECT_THAT(ElementBoost, 42U);
+
+  const PostingList L0 = {2, 4};
+  const PostingList L1 = {1, 4};
+  auto Root = createOr

[PATCH] D50970: [clangd] Implement BOOST iterator

2018-08-22 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev updated this revision to Diff 161936.
kbobyrev marked 2 inline comments as done.
kbobyrev added a comment.

Address remaining comments.


https://reviews.llvm.org/D50970

Files:
  clang-tools-extra/clangd/index/dex/DexIndex.cpp
  clang-tools-extra/clangd/index/dex/Iterator.cpp
  clang-tools-extra/clangd/index/dex/Iterator.h
  clang-tools-extra/unittests/clangd/DexIndexTests.cpp

Index: clang-tools-extra/unittests/clangd/DexIndexTests.cpp
===
--- clang-tools-extra/unittests/clangd/DexIndexTests.cpp
+++ clang-tools-extra/unittests/clangd/DexIndexTests.cpp
@@ -29,6 +29,15 @@
 namespace dex {
 namespace {
 
+std::vector<DocID>
+consumeIDs(Iterator &It, size_t Limit = std::numeric_limits<size_t>::max()) {
+  auto IDAndScore = consume(It, Limit);
+  std::vector<DocID> IDs(IDAndScore.size());
+  for (size_t I = 0; I < IDAndScore.size(); ++I)
+    IDs[I] = IDAndScore[I].first;
+  return IDs;
+}
+
 TEST(DexIndexIterators, DocumentIterator) {
   const PostingList L = {4, 7, 8, 20, 42, 100};
   auto DocIterator = create(L);
@@ -62,7 +71,7 @@
   auto AndWithEmpty = createAnd(create(L0), create(L1));
   EXPECT_TRUE(AndWithEmpty->reachedEnd());
 
-  EXPECT_THAT(consume(*AndWithEmpty), ElementsAre());
+  EXPECT_THAT(consumeIDs(*AndWithEmpty), ElementsAre());
 }
 
 TEST(DexIndexIterators, AndTwoLists) {
@@ -72,7 +81,7 @@
   auto And = createAnd(create(L1), create(L0));
 
   EXPECT_FALSE(And->reachedEnd());
-  EXPECT_THAT(consume(*And), ElementsAre(0U, 7U, 10U, 320U, 9000U));
+  EXPECT_THAT(consumeIDs(*And), ElementsAre(0U, 7U, 10U, 320U, 9000U));
 
   And = createAnd(create(L0), create(L1));
 
@@ -113,7 +122,7 @@
   auto OrWithEmpty = createOr(create(L0), create(L1));
   EXPECT_FALSE(OrWithEmpty->reachedEnd());
 
-  EXPECT_THAT(consume(*OrWithEmpty),
+  EXPECT_THAT(consumeIDs(*OrWithEmpty),
   ElementsAre(0U, 5U, 7U, 10U, 42U, 320U, 9000U));
 }
 
@@ -146,7 +155,7 @@
 
   Or = createOr(create(L0), create(L1));
 
-  EXPECT_THAT(consume(*Or),
+  EXPECT_THAT(consumeIDs(*Or),
   ElementsAre(0U, 4U, 5U, 7U, 10U, 30U, 42U, 60U, 320U, 9000U));
 }
 
@@ -178,53 +187,61 @@
 // FIXME(kbobyrev): The testcase below is similar to what is expected in real
 // queries. It should be updated once new iterators (such as boosting, limiting,
 // etc iterators) appear. However, it is not exhaustive and it would be
-// beneficial to implement automatic generation of query trees for more
-// comprehensive testing.
+// beneficial to implement automatic generation (e.g. fuzzing) of query trees
+// for more comprehensive testing.
 TEST(DexIndexIterators, QueryTree) {
-  // An example of more complicated query
   //
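   //                      +-----------------+
   //                      |And Iterator:1, 5|
   //                      +--------+--------+
   //                               |
   //                               |
-  //                 +------------------------------------+
+  //                 +-------------+----------------------+
   //                 |                                    |
   //                 |                                    |
-  //      +----------v----------+              +----------v---------+
-  //      |And Iterator: 1, 5, 9|              |Or Iterator: 0, 1, 5|
-  //      +----------+----------+              +----------+---------+
+  //      +----------v----------+              +----------v------------+
+  //      |And Iterator: 1, 5, 9|              |Or Iterator: 0, 1, 3, 5|
+  //      +----------+----------+              +----------+------------+
   //                 |                                    |
-  //          +------+-----+                    +---------+-----------+
+  //          +------+-----+                    +---------------------+
   //          |            |                    |         |           |
-  //  +-------v-----+ +----v-----+           +--v--+    +-V--+    +---v---+
-  //  |1, 3, 5, 8, 9| |1, 5, 7, 9|           |Empty|    |0, 5|    |0, 1, 5|
-  //  +-------------+ +----------+           +-----+    +----+    +-------+
-
+  //  +-------v-----+ +----+---+             +--v--+  +---v----+ +----v---+
+  //  |1, 3, 5, 8, 9| |Boost: 2|             |Empty|  |Boost: 3| |Boost: 4|
+  //  +-------------+ +----+---+             +-----+  +---+----+ +----+---+
+  //                       |                              |           |
+  //                  +----v-----+                      +-v--+    +---v---+
+  //                  |1, 5, 7, 9|                      |1, 5|    |0, 3, 5|
+  //                  +----------+                      +----+    +-------+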
+  //
   const PostingList L0 = {1, 3, 5, 8, 9};
   const PostingList L1 = {1, 5, 7, 9};
-  const PostingList L2 = {0, 5};
-  const PostingList L3 = {0, 1, 5};
-  const PostingList L4;
+  const PostingList L3;
+  const PostingList L4 = {1, 5};
+  const PostingList L5 = {0, 3, 5};
 
   // Root of the query tree: [1, 5]
   auto Root = createAnd(
   // Lower

[PATCH] D50970: [clangd] Implement BOOST iterator

2018-08-22 Thread Kirill Bobyrev via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rCTE340409: [clangd] Implement BOOST iterator (authored by 
omtcyfz, committed by ).

Changed prior to commit:
  https://reviews.llvm.org/D50970?vs=161936&id=161944#toc

Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D50970

Files:
  clangd/index/dex/DexIndex.cpp
  clangd/index/dex/Iterator.cpp
  clangd/index/dex/Iterator.h
  unittests/clangd/DexIndexTests.cpp

Index: unittests/clangd/DexIndexTests.cpp
===
--- unittests/clangd/DexIndexTests.cpp
+++ unittests/clangd/DexIndexTests.cpp
@@ -29,6 +29,15 @@
 namespace dex {
 namespace {
 
+std::vector<DocID>
+consumeIDs(Iterator &It, size_t Limit = std::numeric_limits<size_t>::max()) {
+  auto IDAndScore = consume(It, Limit);
+  std::vector<DocID> IDs(IDAndScore.size());
+  for (size_t I = 0; I < IDAndScore.size(); ++I)
+    IDs[I] = IDAndScore[I].first;
+  return IDs;
+}
+
 TEST(DexIndexIterators, DocumentIterator) {
   const PostingList L = {4, 7, 8, 20, 42, 100};
   auto DocIterator = create(L);
@@ -62,7 +71,7 @@
   auto AndWithEmpty = createAnd(create(L0), create(L1));
   EXPECT_TRUE(AndWithEmpty->reachedEnd());
 
-  EXPECT_THAT(consume(*AndWithEmpty), ElementsAre());
+  EXPECT_THAT(consumeIDs(*AndWithEmpty), ElementsAre());
 }
 
 TEST(DexIndexIterators, AndTwoLists) {
@@ -72,7 +81,7 @@
   auto And = createAnd(create(L1), create(L0));
 
   EXPECT_FALSE(And->reachedEnd());
-  EXPECT_THAT(consume(*And), ElementsAre(0U, 7U, 10U, 320U, 9000U));
+  EXPECT_THAT(consumeIDs(*And), ElementsAre(0U, 7U, 10U, 320U, 9000U));
 
   And = createAnd(create(L0), create(L1));
 
@@ -113,7 +122,7 @@
   auto OrWithEmpty = createOr(create(L0), create(L1));
   EXPECT_FALSE(OrWithEmpty->reachedEnd());
 
-  EXPECT_THAT(consume(*OrWithEmpty),
+  EXPECT_THAT(consumeIDs(*OrWithEmpty),
   ElementsAre(0U, 5U, 7U, 10U, 42U, 320U, 9000U));
 }
 
@@ -146,7 +155,7 @@
 
   Or = createOr(create(L0), create(L1));
 
-  EXPECT_THAT(consume(*Or),
+  EXPECT_THAT(consumeIDs(*Or),
   ElementsAre(0U, 4U, 5U, 7U, 10U, 30U, 42U, 60U, 320U, 9000U));
 }
 
@@ -178,53 +187,61 @@
 // FIXME(kbobyrev): The testcase below is similar to what is expected in real
 // queries. It should be updated once new iterators (such as boosting, limiting,
 // etc iterators) appear. However, it is not exhaustive and it would be
-// beneficial to implement automatic generation of query trees for more
-// comprehensive testing.
+// beneficial to implement automatic generation (e.g. fuzzing) of query trees
+// for more comprehensive testing.
 TEST(DexIndexIterators, QueryTree) {
-  // An example of more complicated query
   //
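   //                      +-----------------+
   //                      |And Iterator:1, 5|
   //                      +--------+--------+
   //                               |
   //                               |
-  //                 +------------------------------------+
+  //                 +-------------+----------------------+
   //                 |                                    |
   //                 |                                    |
-  //      +----------v----------+              +----------v---------+
-  //      |And Iterator: 1, 5, 9|              |Or Iterator: 0, 1, 5|
-  //      +----------+----------+              +----------+---------+
+  //      +----------v----------+              +----------v------------+
+  //      |And Iterator: 1, 5, 9|              |Or Iterator: 0, 1, 3, 5|
+  //      +----------+----------+              +----------+------------+
   //                 |                                    |
-  //          +------+-----+                    +---------+-----------+
+  //          +------+-----+                    +---------------------+
   //          |            |                    |         |           |
-  //  +-------v-----+ +----v-----+           +--v--+    +-V--+    +---v---+
-  //  |1, 3, 5, 8, 9| |1, 5, 7, 9|           |Empty|    |0, 5|    |0, 1, 5|
-  //  +-------------+ +----------+           +-----+    +----+    +-------+
-
+  //  +-------v-----+ +----+---+             +--v--+  +---v----+ +----v---+
+  //  |1, 3, 5, 8, 9| |Boost: 2|             |Empty|  |Boost: 3| |Boost: 4|
+  //  +-------------+ +----+---+             +-----+  +---+----+ +----+---+
+  //                       |                              |           |
+  //                  +----v-----+                      +-v--+    +---v---+
+  //                  |1, 5, 7, 9|                      |1, 5|    |0, 3, 5|
+  //                  +----------+                      +----+    +-------+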
+  //
   const PostingList L0 = {1, 3, 5, 8, 9};
   const PostingList L1 = {1, 5, 7, 9};
-  const PostingList L2 = {0, 5};
-  const PostingList L3 = {0, 1, 5};
-  const PostingList L4;
+  const PostingList L3;
+  const PostingList L4 = {1, 5};
+  const PostingList L5 = {0, 3, 5};
 
   // Root of the query tree: [1, 5]
   auto Root = 

[PATCH] D51029: [clangd] Implement LIMIT iterator

2018-08-22 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev updated this revision to Diff 161947.
kbobyrev edited the summary of this revision.
kbobyrev added a reviewer: sammccall.
kbobyrev added a comment.

Rebase on top of BOOST iterator patch, update tests.


https://reviews.llvm.org/D51029

Files:
  clang-tools-extra/clangd/index/dex/Iterator.cpp
  clang-tools-extra/clangd/index/dex/Iterator.h
  clang-tools-extra/unittests/clangd/DexIndexTests.cpp

Index: clang-tools-extra/unittests/clangd/DexIndexTests.cpp
===
--- clang-tools-extra/unittests/clangd/DexIndexTests.cpp
+++ clang-tools-extra/unittests/clangd/DexIndexTests.cpp
@@ -265,24 +265,27 @@
 }
 
 TEST(DexIndexIterators, Limit) {
-  const PostingList L0 = {4, 7, 8, 20, 42, 100};
-  const PostingList L1 = {1, 3, 5, 8, 9};
-  const PostingList L2 = {1, 5, 7, 9};
-  const PostingList L3 = {0, 5};
-  const PostingList L4 = {0, 1, 5};
-  const PostingList L5;
+  const PostingList L0 = {3, 6, 7, 20, 42, 100};
+  const PostingList L1 = {1, 3, 5, 6, 7, 30, 100};
+  const PostingList L2 = {0, 3, 5, 7, 8, 100};
 
   auto DocIterator = create(L0);
-  EXPECT_THAT(consumeIDs(*DocIterator, 42), ElementsAre(4, 7, 8, 20, 42, 100));
+  EXPECT_THAT(consumeIDs(*DocIterator, 42),
+  ElementsAre(3, 6, 7, 20, 42, 100));
 
   DocIterator = create(L0);
-  EXPECT_THAT(consumeIDs(*DocIterator), ElementsAre(4, 7, 8, 20, 42, 100));
+  EXPECT_THAT(consumeIDs(*DocIterator), ElementsAre(3, 6, 7, 20, 42, 100));
 
   DocIterator = create(L0);
-  EXPECT_THAT(consumeIDs(*DocIterator, 3), ElementsAre(4, 7, 8));
+  EXPECT_THAT(consumeIDs(*DocIterator, 3), ElementsAre(3, 6, 7));
 
   DocIterator = create(L0);
   EXPECT_THAT(consumeIDs(*DocIterator, 0), ElementsAre());
+
+  auto AndIterator =
+  createAnd(createLimit(createTrue(9000), 343), createLimit(create(L0), 2),
+createLimit(create(L1), 3), createLimit(create(L2), 42));
+  EXPECT_THAT(consumeIDs(*AndIterator), ElementsAre(3, 7));
 }
 
 TEST(DexIndexIterators, True) {
Index: clang-tools-extra/clangd/index/dex/Iterator.h
===
--- clang-tools-extra/clangd/index/dex/Iterator.h
+++ clang-tools-extra/clangd/index/dex/Iterator.h
@@ -164,6 +164,10 @@
 /// trim Top K using retrieval score.
 std::unique_ptr createBoost(std::unique_ptr Child,
   float Factor);
+/// Returns LIMIT iterator, which is exhausted as soon requested number of items
+/// is consumed from the root of query tree.
+std::unique_ptr<Iterator> createLimit(std::unique_ptr<Iterator> Child,
+                                      size_t Limit);
 
 /// This allows createAnd(create(...), create(...)) syntax.
 template  std::unique_ptr createAnd(Args... args) {
Index: clang-tools-extra/clangd/index/dex/Iterator.cpp
===
--- clang-tools-extra/clangd/index/dex/Iterator.cpp
+++ clang-tools-extra/clangd/index/dex/Iterator.cpp
@@ -318,6 +318,41 @@
   float Factor;
 };
 
+/// This iterator is a wrapper around the underlying child which limits the
+/// number of elements which are retrieved from the whole iterator tree.
+class LimitIterator : public Iterator {
+public:
+  LimitIterator(std::unique_ptr<Iterator> Child, size_t Limit)
+      : Child(move(Child)), ItemsLeft(Limit) {}
+
+  bool reachedEnd() const override {
+    return ItemsLeft == 0 || Child->reachedEnd();
+  }
+
+  void advance() override { Child->advance(); }
+
+  void advanceTo(DocID ID) override { Child->advanceTo(ID); }
+
+  DocID peek() const override { return Child->peek(); }
+
+  /// Decreases the limit in case the element consumed at top of the query tree
+  /// comes from the underlying iterator.
+  float consume(DocID ID) override {
+    if (!reachedEnd() && peek() == ID)
+      --ItemsLeft;
+    return Child->consume(ID);
+  }
+
+private:
+  llvm::raw_ostream &dump(llvm::raw_ostream &OS) const override {
+    OS << "(LIMIT items left " << ItemsLeft << *Child << ")";
+    return OS;
+  }
+
+  std::unique_ptr<Iterator> Child;
+  size_t ItemsLeft;
+};
+
 } // end namespace
 
 std::vector> consume(Iterator &It, size_t Limit) {
@@ -353,6 +388,11 @@
   return llvm::make_unique(move(Child), Factor);
 }
 
+std::unique_ptr<Iterator> createLimit(std::unique_ptr<Iterator> Child,
+                                      size_t Size) {
+  return llvm::make_unique<LimitIterator>(move(Child), Size);
+}
+
 } // namespace dex
 } // namespace clangd
 } // namespace clang


[PATCH] D51029: [clangd] Implement LIMIT iterator

2018-08-22 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev updated this revision to Diff 161952.
kbobyrev added a comment.

Use better wording for internal `LimitIterator` documentation, perform minor 
code cleanup.
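
To illustrate the intended LIMIT semantics, here is a simplified, flattened 
model of the AND case from the `Limit` test in the diff below. It is a sketch 
under assumptions, not the patch's implementation: `LimitedList` and 
`intersect` are hypothetical names, the iterator hierarchy is replaced by a 
plain struct, and the `createTrue(9000)` child is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using DocID = unsigned;

// Simplified stand-in for a LIMIT iterator over a sorted posting list.
struct LimitedList {
  LimitedList(std::vector<DocID> Docs, std::size_t Limit)
      : Docs(std::move(Docs)), ItemsLeft(Limit) {}

  bool reachedEnd() const { return ItemsLeft == 0 || Pos == Docs.size(); }
  DocID peek() const { return Docs[Pos]; }
  void advance() { ++Pos; }
  void advanceTo(DocID ID) {
    while (Pos < Docs.size() && Docs[Pos] < ID)
      ++Pos;
  }
  // Called for every item emitted at the root; decrements the limit only if
  // the emitted item comes from this list.
  void consume(DocID ID) {
    if (!reachedEnd() && peek() == ID)
      --ItemsLeft;
  }

  std::vector<DocID> Docs;
  std::size_t ItemsLeft;
  std::size_t Pos = 0;
};

// Intersects the lists, stopping as soon as any of them is exhausted, either
// because it ran out of documents or because its limit was used up.
std::vector<DocID> intersect(std::vector<LimitedList> Lists) {
  std::vector<DocID> Result;
  if (Lists.empty())
    return Result;
  while (true) {
    DocID Target = 0;
    for (const LimitedList &L : Lists) {
      if (L.reachedEnd())
        return Result;
      Target = std::max(Target, L.peek());
    }
    bool AllMatch = true;
    for (LimitedList &L : Lists) {
      L.advanceTo(Target);
      if (L.reachedEnd())
        return Result;
      AllMatch = AllMatch && L.peek() == Target;
    }
    if (!AllMatch)
      continue;
    Result.push_back(Target);
    for (LimitedList &L : Lists)
      L.consume(Target); // the consume(peek()) call at the root, flattened
    for (LimitedList &L : Lists)
      if (!L.reachedEnd())
        L.advance();
  }
}
```

With the three posting lists from the test and limits 2, 3 and 42, the full 
intersection would be 3, 7, 100, but the list limited to 2 items is exhausted 
after 3 and 7 have been consumed, so the model stops there, in line with the 
`ElementsAre(3, 7)` expectation.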


https://reviews.llvm.org/D51029

Files:
  clang-tools-extra/clangd/index/dex/Iterator.cpp
  clang-tools-extra/clangd/index/dex/Iterator.h
  clang-tools-extra/unittests/clangd/DexIndexTests.cpp

Index: clang-tools-extra/unittests/clangd/DexIndexTests.cpp
===
--- clang-tools-extra/unittests/clangd/DexIndexTests.cpp
+++ clang-tools-extra/unittests/clangd/DexIndexTests.cpp
@@ -265,24 +265,27 @@
 }
 
 TEST(DexIndexIterators, Limit) {
-  const PostingList L0 = {4, 7, 8, 20, 42, 100};
-  const PostingList L1 = {1, 3, 5, 8, 9};
-  const PostingList L2 = {1, 5, 7, 9};
-  const PostingList L3 = {0, 5};
-  const PostingList L4 = {0, 1, 5};
-  const PostingList L5;
+  const PostingList L0 = {3, 6, 7, 20, 42, 100};
+  const PostingList L1 = {1, 3, 5, 6, 7, 30, 100};
+  const PostingList L2 = {0, 3, 5, 7, 8, 100};
 
   auto DocIterator = create(L0);
-  EXPECT_THAT(consumeIDs(*DocIterator, 42), ElementsAre(4, 7, 8, 20, 42, 100));
+  EXPECT_THAT(consumeIDs(*DocIterator, 42),
+  ElementsAre(3, 6, 7, 20, 42, 100));
 
   DocIterator = create(L0);
-  EXPECT_THAT(consumeIDs(*DocIterator), ElementsAre(4, 7, 8, 20, 42, 100));
+  EXPECT_THAT(consumeIDs(*DocIterator), ElementsAre(3, 6, 7, 20, 42, 100));
 
   DocIterator = create(L0);
-  EXPECT_THAT(consumeIDs(*DocIterator, 3), ElementsAre(4, 7, 8));
+  EXPECT_THAT(consumeIDs(*DocIterator, 3), ElementsAre(3, 6, 7));
 
   DocIterator = create(L0);
   EXPECT_THAT(consumeIDs(*DocIterator, 0), ElementsAre());
+
+  auto AndIterator =
+  createAnd(createLimit(createTrue(9000), 343), createLimit(create(L0), 2),
+createLimit(create(L1), 3), createLimit(create(L2), 42));
+  EXPECT_THAT(consumeIDs(*AndIterator), ElementsAre(3, 7));
 }
 
 TEST(DexIndexIterators, True) {
Index: clang-tools-extra/clangd/index/dex/Iterator.h
===
--- clang-tools-extra/clangd/index/dex/Iterator.h
+++ clang-tools-extra/clangd/index/dex/Iterator.h
@@ -165,6 +165,11 @@
 std::unique_ptr<Iterator> createBoost(std::unique_ptr<Iterator> Child,
                                       float Factor);
 
+/// Returns LIMIT iterator, which is exhausted as soon requested number of items
+/// is consumed from the root of query tree.
+std::unique_ptr<Iterator> createLimit(std::unique_ptr<Iterator> Child,
+                                      size_t Limit);
+
 /// This allows createAnd(create(...), create(...)) syntax.
 template <typename... Args> std::unique_ptr<Iterator> createAnd(Args... args) {
   std::vector<std::unique_ptr<Iterator>> Children;
Index: clang-tools-extra/clangd/index/dex/Iterator.cpp
===
--- clang-tools-extra/clangd/index/dex/Iterator.cpp
+++ clang-tools-extra/clangd/index/dex/Iterator.cpp
@@ -318,6 +318,43 @@
   float Factor;
 };
 
+/// This iterator limits the number of items retrieved from the child iterator
+/// on top of the query tree. To ensure that query tree with LIMIT iterators
+/// inside works correctly, users have to call Root->consume(Root->peek()) each
+/// time item is retrieved at the root of query tree.
+class LimitIterator : public Iterator {
+public:
+  LimitIterator(std::unique_ptr<Iterator> Child, size_t Limit)
+      : Child(move(Child)), ItemsLeft(Limit) {}
+
+  bool reachedEnd() const override {
+    return ItemsLeft == 0 || Child->reachedEnd();
+  }
+
+  void advance() override { Child->advance(); }
+
+  void advanceTo(DocID ID) override { Child->advanceTo(ID); }
+
+  DocID peek() const override { return Child->peek(); }
+
+  /// Decreases the limit in case the element consumed at top of the query tree
+  /// comes from the underlying iterator.
+  float consume(DocID ID) override {
+    if (!reachedEnd() && peek() == ID)
+      --ItemsLeft;
+    return Child->consume(ID);
+  }
+
+private:
+  llvm::raw_ostream &dump(llvm::raw_ostream &OS) const override {
+    OS << "(LIMIT items left " << ItemsLeft << *Child << ")";
+    return OS;
+  }
+
+  std::unique_ptr<Iterator> Child;
+  size_t ItemsLeft;
+};
+
 } // end namespace
 
 std::vector> consume(Iterator &It, size_t Limit) {
@@ -353,6 +390,11 @@
   return llvm::make_unique<BoostIterator>(move(Child), Factor);
 }
 
+std::unique_ptr<Iterator> createLimit(std::unique_ptr<Iterator> Child,
+                                      size_t Size) {
+  return llvm::make_unique<LimitIterator>(move(Child), Size);
+}
+
 } // namespace dex
 } // namespace clangd
 } // namespace clang
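
For readers following the patch, the LIMIT behaviour can be illustrated with a small standalone sketch. This is not the clangd code: Doc, Iter, ListIter, LimitIter, limited and drain are hypothetical names, and the simplified interface only assumes what the diff shows, namely that the limit is decremented when the consumed ID is the one the child currently points at.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

using Doc = unsigned; // hypothetical stand-in for DocID

// Simplified interface; the method names follow the diff, the rest is assumed.
struct Iter {
  virtual ~Iter() = default;
  virtual bool reachedEnd() const = 0;
  virtual void advance() = 0;
  virtual Doc peek() const = 0;
  virtual void consume(Doc ID) = 0;
};

// Walks a sorted list of document IDs.
struct ListIter : Iter {
  explicit ListIter(std::vector<Doc> Docs) : Docs(std::move(Docs)) {}
  bool reachedEnd() const override { return Pos == Docs.size(); }
  void advance() override { assert(!reachedEnd()); ++Pos; }
  Doc peek() const override { assert(!reachedEnd()); return Docs[Pos]; }
  void consume(Doc) override {}
  std::vector<Doc> Docs;
  std::size_t Pos = 0;
};

// Reports the end once Limit of its own items were consumed at the root.
struct LimitIter : Iter {
  LimitIter(std::unique_ptr<Iter> Child, std::size_t Limit)
      : Child(std::move(Child)), ItemsLeft(Limit) {}
  bool reachedEnd() const override {
    return ItemsLeft == 0 || Child->reachedEnd();
  }
  void advance() override { Child->advance(); }
  Doc peek() const override { return Child->peek(); }
  void consume(Doc ID) override {
    // Only count the item if it is the one this subtree currently points at.
    if (!reachedEnd() && peek() == ID)
      --ItemsLeft;
    Child->consume(ID);
  }
  std::unique_ptr<Iter> Child;
  std::size_t ItemsLeft;
};

// Convenience helper: a list iterator wrapped in a limit.
std::unique_ptr<LimitIter> limited(std::vector<Doc> Docs, std::size_t Limit) {
  return std::make_unique<LimitIter>(
      std::make_unique<ListIter>(std::move(Docs)), Limit);
}

// Root-level loop: every retrieved item is reported back through consume().
std::vector<Doc> drain(Iter &Root) {
  std::vector<Doc> Out;
  while (!Root.reachedEnd()) {
    Out.push_back(Root.peek());
    Root.consume(Out.back());
    Root.advance();
  }
  return Out;
}
```

In this sketch, draining limited({3, 6, 7, 20, 42, 100}, 3) stops after three items, while a consume() call with an ID the child does not point at leaves ItemsLeft untouched. That second case is the reason for the peek() == ID check: in a larger query tree, a limited branch should only pay for items that actually came from it.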
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D51154: [clangd] Log memory usage of DexIndex and MemIndex

2018-08-23 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev created this revision.
kbobyrev added reviewers: ioeric, ilya-biryukov, sammccall.
kbobyrev added a project: clang-tools-extra.
Herald added subscribers: kadircet, arphaman, jkorous, MaskRay.

This patch logs an estimate of the built index size to the verbose logs. This 
is useful for optimizing the memory usage of DexIndex and for comparing it 
with MemIndex.


https://reviews.llvm.org/D51154

Files:
  clang-tools-extra/clangd/index/MemIndex.cpp
  clang-tools-extra/clangd/index/MemIndex.h
  clang-tools-extra/clangd/index/dex/DexIndex.cpp
  clang-tools-extra/clangd/index/dex/DexIndex.h


Index: clang-tools-extra/clangd/index/dex/DexIndex.h
===
--- clang-tools-extra/clangd/index/dex/DexIndex.h
+++ clang-tools-extra/clangd/index/dex/DexIndex.h
@@ -58,6 +58,9 @@
Callback) const override;
 
 private:
+  /// Returns estimate size of the index in megabytes.
+  size_t estimateMemoryUsage();
+
   mutable std::mutex Mutex;
 
   std::shared_ptr> Symbols /*GUARDED_BY(Mutex)*/;
Index: clang-tools-extra/clangd/index/dex/DexIndex.cpp
===
--- clang-tools-extra/clangd/index/dex/DexIndex.cpp
+++ clang-tools-extra/clangd/index/dex/DexIndex.cpp
@@ -67,6 +67,9 @@
 InvertedIndex = std::move(TempInvertedIndex);
 SymbolQuality = std::move(TempSymbolQuality);
   }
+
+  vlog("Built DexIndex with estimated memory usage {0} MB.",
+   estimateMemoryUsage());
 }
 
 std::unique_ptr DexIndex::build(SymbolSlab Slab) {
@@ -171,6 +174,20 @@
   log("findOccurrences is not implemented.");
 }
 
+size_t DexIndex::estimateMemoryUsage() {
+  size_t Bytes = LookupTable.size() * sizeof(std::pair);
+  Bytes += SymbolQuality.size() * sizeof(std::pair);
+  Bytes += InvertedIndex.size() * sizeof(Token);
+  {
+std::lock_guard Lock(Mutex);
+
+for (const auto &P : InvertedIndex) {
+  Bytes += P.second.size() * sizeof(DocID);
+}
+  }
+  return Bytes / (1000 * 1000);
+}
+
 } // namespace dex
 } // namespace clangd
 } // namespace clang
Index: clang-tools-extra/clangd/index/MemIndex.h
===
--- clang-tools-extra/clangd/index/MemIndex.h
+++ clang-tools-extra/clangd/index/MemIndex.h
@@ -40,6 +40,9 @@
Callback) const override;
 
 private:
+  /// Returns estimate size of the index in megabytes.
+  size_t estimateMemoryUsage();
+
   std::shared_ptr> Symbols;
   // Index is a set of symbols that are deduplicated by symbol IDs.
   // FIXME: build smarter index structure.
Index: clang-tools-extra/clangd/index/MemIndex.cpp
===
--- clang-tools-extra/clangd/index/MemIndex.cpp
+++ clang-tools-extra/clangd/index/MemIndex.cpp
@@ -26,6 +26,9 @@
 Index = std::move(TempIndex);
 Symbols = std::move(Syms); // Relase old symbols.
   }
+
+  vlog("Built MemIndex with estimated memory usage {0} MB.",
+   estimateMemoryUsage());
 }
 
 std::unique_ptr MemIndex::build(SymbolSlab Slab) {
@@ -98,5 +101,10 @@
   &Snap->Pointers);
 }
 
+size_t MemIndex::estimateMemoryUsage() {
+  size_t Bytes = Index.size() * sizeof(std::pair);
+  return Bytes / (1000 * 1000);
+}
+
 } // namespace clangd
 } // namespace clang
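
The estimation approach in this diff (entry counts multiplied by per-entry sizes, summed under the mutex) can be sketched in isolation. The names below (ToyIndex, add, toMegabytes, and the Token and DocID aliases) are hypothetical stand-ins, not the clangd types, and the result is only a rough figure under the stated simplifications.

```cpp
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <string>
#include <unordered_map>
#include <vector>

using DocID = std::uint32_t; // hypothetical stand-in for the clangd type
using Token = std::string;   // hypothetical stand-in for the clangd type

class ToyIndex {
public:
  void add(const Token &T, DocID ID) {
    std::lock_guard<std::mutex> Lock(Mutex);
    InvertedIndex[T].push_back(ID);
  }

  // Counts per-entry payload only: container overhead, vector capacity slack
  // and heap storage owned by the tokens themselves are ignored.
  std::size_t estimateMemoryUsage() const {
    std::lock_guard<std::mutex> Lock(Mutex);
    std::size_t Bytes = InvertedIndex.size() * sizeof(Token);
    for (const auto &P : InvertedIndex)
      Bytes += P.second.size() * sizeof(DocID);
    return Bytes;
  }

private:
  mutable std::mutex Mutex;
  std::unordered_map<Token, std::vector<DocID>> InvertedIndex;
};

// Integer division by 1000 * 1000, as in this revision of the patch.
std::size_t toMegabytes(std::size_t Bytes) { return Bytes / (1000 * 1000); }
```

One consequence of the integer division is that any index smaller than 1,000,000 bytes is reported as 0 MB, so small test indexes would all log the same value.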



[PATCH] D51154: [clangd] Log memory usage of DexIndex and MemIndex

2018-08-23 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev updated this revision to Diff 162146.
kbobyrev marked 6 inline comments as done.
kbobyrev added a comment.

Address a bunch of comments.


https://reviews.llvm.org/D51154

Files:
  clang-tools-extra/clangd/index/FileIndex.cpp
  clang-tools-extra/clangd/index/FileIndex.h
  clang-tools-extra/clangd/index/Index.h
  clang-tools-extra/clangd/index/MemIndex.cpp
  clang-tools-extra/clangd/index/MemIndex.h
  clang-tools-extra/clangd/index/Merge.cpp
  clang-tools-extra/clangd/index/dex/DexIndex.cpp
  clang-tools-extra/clangd/index/dex/DexIndex.h
  clang-tools-extra/unittests/clangd/CodeCompleteTests.cpp

Index: clang-tools-extra/unittests/clangd/CodeCompleteTests.cpp
===
--- clang-tools-extra/unittests/clangd/CodeCompleteTests.cpp
+++ clang-tools-extra/unittests/clangd/CodeCompleteTests.cpp
@@ -923,6 +923,10 @@
llvm::function_ref
Callback) const override {}
 
+  // This is incorrect, but IndexRequestCollector is not an actual index and it
+  // isn't used in production code.
+  size_t estimateMemoryUsage() const override { return 0; }
+
   const std::vector allRequests() const { return Requests; }
 
 private:
Index: clang-tools-extra/clangd/index/dex/DexIndex.h
===
--- clang-tools-extra/clangd/index/dex/DexIndex.h
+++ clang-tools-extra/clangd/index/dex/DexIndex.h
@@ -57,7 +57,10 @@
llvm::function_ref
Callback) const override;
 
+  size_t estimateMemoryUsage() const override;
+
 private:
+
   mutable std::mutex Mutex;
 
   std::shared_ptr> Symbols /*GUARDED_BY(Mutex)*/;
Index: clang-tools-extra/clangd/index/dex/DexIndex.cpp
===
--- clang-tools-extra/clangd/index/dex/DexIndex.cpp
+++ clang-tools-extra/clangd/index/dex/DexIndex.cpp
@@ -67,6 +67,9 @@
 InvertedIndex = std::move(TempInvertedIndex);
 SymbolQuality = std::move(TempSymbolQuality);
   }
+
+  vlog("Built DexIndex with estimated memory usage {0} bytes.",
+   estimateMemoryUsage());
 }
 
 std::unique_ptr DexIndex::build(SymbolSlab Slab) {
@@ -171,6 +174,22 @@
   log("findOccurrences is not implemented.");
 }
 
+size_t DexIndex::estimateMemoryUsage() const {
+  size_t Bytes = 0;
+  {
+std::lock_guard Lock(Mutex);
+
+Bytes += LookupTable.size() * sizeof(std::pair);
+Bytes += SymbolQuality.size() * sizeof(std::pair);
+Bytes += InvertedIndex.size() * sizeof(Token);
+
+for (const auto &P : InvertedIndex) {
+  Bytes += P.second.size() * sizeof(DocID);
+}
+  }
+  return Bytes;
+}
+
 } // namespace dex
 } // namespace clangd
 } // namespace clang
Index: clang-tools-extra/clangd/index/Merge.cpp
===
--- clang-tools-extra/clangd/index/Merge.cpp
+++ clang-tools-extra/clangd/index/Merge.cpp
@@ -84,6 +84,10 @@
 log("findOccurrences is not implemented.");
   }
 
+  size_t estimateMemoryUsage() const override {
+return Dynamic->estimateMemoryUsage() + Static->estimateMemoryUsage();
+  }
+
 private:
   const SymbolIndex *Dynamic, *Static;
 };
Index: clang-tools-extra/clangd/index/MemIndex.h
===
--- clang-tools-extra/clangd/index/MemIndex.h
+++ clang-tools-extra/clangd/index/MemIndex.h
@@ -39,7 +39,10 @@
llvm::function_ref
Callback) const override;
 
+  size_t estimateMemoryUsage() const override;
+
 private:
+
   std::shared_ptr> Symbols;
   // Index is a set of symbols that are deduplicated by symbol IDs.
   // FIXME: build smarter index structure.
Index: clang-tools-extra/clangd/index/MemIndex.cpp
===
--- clang-tools-extra/clangd/index/MemIndex.cpp
+++ clang-tools-extra/clangd/index/MemIndex.cpp
@@ -26,6 +26,9 @@
 Index = std::move(TempIndex);
 Symbols = std::move(Syms); // Relase old symbols.
   }
+
+  vlog("Built MemIndex with estimated memory usage {0} bytes.",
+   estimateMemoryUsage());
 }
 
 std::unique_ptr MemIndex::build(SymbolSlab Slab) {
@@ -98,5 +101,18 @@
   &Snap->Pointers);
 }
 
+size_t MemIndex::estimateMemoryUsage() const {
+  size_t Bytes = 0;
+
+  {
+std::lock_guard Lock(Mutex);
+
+Bytes += Index.getMemorySize();
+return Bytes;
+  }
+
+}
+
+
 } // namespace clangd
 } // namespace clang
Index: clang-tools-extra/clangd/index/Index.h
===
--- clang-tools-extra/clangd/index/Index.h
+++ clang-tools-extra/clangd/index/Index.h
@@ -385,6 +385,12 @@
   virtual void findOccurrences(
   const OccurrencesRequest &Req,
   llvm::function_ref Callback) const = 0;
+
+  /// Returns estimated size of index (in bytes).
+  // FIXME(kbobyrev): Curre
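
The shape of this revision (a const virtual method on the index interface, with the merged index adding up its two halves) can be sketched as follows. IndexSketch, VectorIndex and MergedSketch are hypothetical names used only for illustration; the real interface has many more members.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the index interface.
class IndexSketch {
public:
  virtual ~IndexSketch() = default;
  // Estimated size of the index in bytes.
  virtual std::size_t estimateMemoryUsage() const = 0;
};

// A trivial index that stores N document IDs.
class VectorIndex : public IndexSketch {
public:
  explicit VectorIndex(std::size_t N) : IDs(N) {}
  std::size_t estimateMemoryUsage() const override {
    return IDs.size() * sizeof(unsigned);
  }

private:
  std::vector<unsigned> IDs;
};

// Does not own its parts; reports the sum of both estimates.
class MergedSketch : public IndexSketch {
public:
  MergedSketch(const IndexSketch *Dynamic, const IndexSketch *Static)
      : Dynamic(Dynamic), Static(Static) {}
  std::size_t estimateMemoryUsage() const override {
    return Dynamic->estimateMemoryUsage() + Static->estimateMemoryUsage();
  }

private:
  const IndexSketch *Dynamic, *Static;
};
```

Returning raw bytes from the interface leaves unit conversion to the caller, which avoids losing small values to rounding inside each implementation.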

[PATCH] D51154: [clangd] Log memory usage of DexIndex and MemIndex

2018-08-23 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev added inline comments.



Comment at: clang-tools-extra/clangd/index/dex/DexIndex.cpp:180
+  Bytes += SymbolQuality.size() * sizeof(std::pair);
+  Bytes += InvertedIndex.size() * sizeof(Token);
+  {

sammccall wrote:
> I think you're not counting the size of the actual symbols.
> This is difficult to do precisely, but avoiding it seems misleading (we have 
> "shared ownership" but it's basically exclusive). What's the plan here?
As discussed offline: I should put a FIXME and leave it for later.


https://reviews.llvm.org/D51154





[PATCH] D47471: [clangd] Minor cleanup

2018-05-29 Thread Kirill Bobyrev via Phabricator via cfe-commits
omtcyfz created this revision.
omtcyfz added reviewers: ioeric, ilya-biryukov.
omtcyfz added a project: clang-tools-extra.
Herald added subscribers: jkorous, MaskRay.

This patch silences a few clang-tidy warnings, removes unwanted trailing 
whitespace and enforces coding guidelines.

The functionality is not affected since the cleanup is rather straightforward; 
all clangd tests are still green.


Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D47471

Files:
  clangd/CodeComplete.cpp
  clangd/tool/ClangdMain.cpp


Index: clangd/tool/ClangdMain.cpp
===
--- clangd/tool/ClangdMain.cpp
+++ clangd/tool/ClangdMain.cpp
@@ -33,7 +33,7 @@
 // Build an in-memory static index for global symbols from a YAML-format file.
 // The size of global symbols should be relatively small, so that all symbols
 // can be managed in memory.
-std::unique_ptr BuildStaticIndex(llvm::StringRef YamlSymbolFile) {
+std::unique_ptr buildStaticIndex(llvm::StringRef YamlSymbolFile) {
   auto Buffer = llvm::MemoryBuffer::getFile(YamlSymbolFile);
   if (!Buffer) {
 llvm::errs() << "Can't open " << YamlSymbolFile << "\n";
@@ -223,7 +223,7 @@
   Opts.BuildDynamicSymbolIndex = EnableIndex;
   std::unique_ptr StaticIdx;
   if (EnableIndex && !YamlSymbolFile.empty()) {
-StaticIdx = BuildStaticIndex(YamlSymbolFile);
+StaticIdx = buildStaticIndex(YamlSymbolFile);
 Opts.StaticIndex = StaticIdx.get();
   }
   Opts.AsyncThreadsCount = WorkerThreadsCount;
Index: clangd/CodeComplete.cpp
===
--- clangd/CodeComplete.cpp
+++ clangd/CodeComplete.cpp
@@ -359,13 +359,13 @@
 
 // Get all scopes that will be queried in indexes.
 std::vector getQueryScopes(CodeCompletionContext &CCContext,
-const SourceManager& SM) {
-  auto GetAllAccessibleScopes = [](CodeCompletionContext& CCContext) {
+const SourceManager &SM) {
+  auto GetAllAccessibleScopes = [](CodeCompletionContext &CCContext) {
 SpecifiedScope Info;
-for (auto* Context : CCContext.getVisitedContexts()) {
+for (auto *Context : CCContext.getVisitedContexts()) {
   if (isa(Context))
 Info.AccessibleScopes.push_back(""); // global namespace
-  else if (const auto*NS = dyn_cast(Context))
+  else if (const auto *NS = dyn_cast(Context))
 Info.AccessibleScopes.push_back(NS->getQualifiedNameAsString() + "::");
 }
 return Info;
@@ -397,8 +397,9 @@
   Info.AccessibleScopes.push_back(""); // global namespace
 
   Info.UnresolvedQualifier =
-  Lexer::getSourceText(CharSourceRange::getCharRange((*SS)->getRange()),
-   SM, clang::LangOptions()).ltrim("::");
+  Lexer::getSourceText(CharSourceRange::getCharRange((*SS)->getRange()), SM,
+   clang::LangOptions())
+  .ltrim("::");
   // Sema excludes the trailing "::".
   if (!Info.UnresolvedQualifier->empty())
 *Info.UnresolvedQualifier += "::";
@@ -590,7 +591,7 @@
   SigHelp.signatures.push_back(ProcessOverloadCandidate(
   Candidate, *CCS,
   getParameterDocComment(S.getASTContext(), Candidate, CurrentArg,
- /*CommentsFromHeader=*/false)));
+ /*CommentsFromHeaders=*/false)));
 }
   }
 
@@ -696,7 +697,7 @@
   &DummyDiagsConsumer, false),
   Input.VFS);
   if (!CI) {
-log("Couldn't create CompilerInvocation");;
+log("Couldn't create CompilerInvocation");
 return false;
   }
   auto &FrontendOpts = CI->getFrontendOpts();
@@ -1013,7 +1014,7 @@
 LLVM_DEBUG(llvm::dbgs()
<< "CodeComplete: " << C.Name << (IndexResult ? " (index)" : "")
<< (SemaResult ? " (sema)" : "") << " = " << Scores.finalScore
-   << "\n" 
+   << "\n"
<< Quality << Relevance << "\n");
 
 NSema += bool(SemaResult);
@@ -1035,7 +1036,8 @@
/*CommentsFromHeader=*/false);
   }
 }
-return Candidate.build(FileName, Scores, Opts, SemaCCS, Includes.get(), DocComment);
+return Candidate.build(FileName, Scores, Opts, SemaCCS, Includes.get(),
+   DocComment);
   }
 };
 



[PATCH] D47471: [clangd] Minor cleanup

2018-05-29 Thread Kirill Bobyrev via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL333411: [clangd] Minor cleanup (authored by omtcyfz, 
committed by ).
Herald added subscribers: llvm-commits, klimek.

Changed prior to commit:
  https://reviews.llvm.org/D47471?vs=148847&id=148887#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D47471

Files:
  clang-tools-extra/trunk/clangd/CodeComplete.cpp
  clang-tools-extra/trunk/clangd/tool/ClangdMain.cpp


Index: clang-tools-extra/trunk/clangd/tool/ClangdMain.cpp
===
--- clang-tools-extra/trunk/clangd/tool/ClangdMain.cpp
+++ clang-tools-extra/trunk/clangd/tool/ClangdMain.cpp
@@ -33,7 +33,7 @@
 // Build an in-memory static index for global symbols from a YAML-format file.
 // The size of global symbols should be relatively small, so that all symbols
 // can be managed in memory.
-std::unique_ptr BuildStaticIndex(llvm::StringRef YamlSymbolFile) {
+std::unique_ptr buildStaticIndex(llvm::StringRef YamlSymbolFile) {
   auto Buffer = llvm::MemoryBuffer::getFile(YamlSymbolFile);
   if (!Buffer) {
 llvm::errs() << "Can't open " << YamlSymbolFile << "\n";
@@ -223,7 +223,7 @@
   Opts.BuildDynamicSymbolIndex = EnableIndex;
   std::unique_ptr StaticIdx;
   if (EnableIndex && !YamlSymbolFile.empty()) {
-StaticIdx = BuildStaticIndex(YamlSymbolFile);
+StaticIdx = buildStaticIndex(YamlSymbolFile);
 Opts.StaticIndex = StaticIdx.get();
   }
   Opts.AsyncThreadsCount = WorkerThreadsCount;
Index: clang-tools-extra/trunk/clangd/CodeComplete.cpp
===
--- clang-tools-extra/trunk/clangd/CodeComplete.cpp
+++ clang-tools-extra/trunk/clangd/CodeComplete.cpp
@@ -359,13 +359,13 @@
 
 // Get all scopes that will be queried in indexes.
 std::vector getQueryScopes(CodeCompletionContext &CCContext,
-const SourceManager& SM) {
-  auto GetAllAccessibleScopes = [](CodeCompletionContext& CCContext) {
+const SourceManager &SM) {
+  auto GetAllAccessibleScopes = [](CodeCompletionContext &CCContext) {
 SpecifiedScope Info;
-for (auto* Context : CCContext.getVisitedContexts()) {
+for (auto *Context : CCContext.getVisitedContexts()) {
   if (isa(Context))
 Info.AccessibleScopes.push_back(""); // global namespace
-  else if (const auto*NS = dyn_cast(Context))
+  else if (const auto *NS = dyn_cast(Context))
 Info.AccessibleScopes.push_back(NS->getQualifiedNameAsString() + "::");
 }
 return Info;
@@ -397,8 +397,9 @@
   Info.AccessibleScopes.push_back(""); // global namespace
 
   Info.UnresolvedQualifier =
-  Lexer::getSourceText(CharSourceRange::getCharRange((*SS)->getRange()),
-   SM, clang::LangOptions()).ltrim("::");
+  Lexer::getSourceText(CharSourceRange::getCharRange((*SS)->getRange()), SM,
+   clang::LangOptions())
+  .ltrim("::");
   // Sema excludes the trailing "::".
   if (!Info.UnresolvedQualifier->empty())
 *Info.UnresolvedQualifier += "::";
@@ -590,7 +591,7 @@
   SigHelp.signatures.push_back(ProcessOverloadCandidate(
   Candidate, *CCS,
   getParameterDocComment(S.getASTContext(), Candidate, CurrentArg,
- /*CommentsFromHeader=*/false)));
+ /*CommentsFromHeaders=*/false)));
 }
   }
 
@@ -696,7 +697,7 @@
   &DummyDiagsConsumer, false),
   Input.VFS);
   if (!CI) {
-log("Couldn't create CompilerInvocation");;
+log("Couldn't create CompilerInvocation");
 return false;
   }
   auto &FrontendOpts = CI->getFrontendOpts();
@@ -1013,7 +1014,7 @@
 LLVM_DEBUG(llvm::dbgs()
<< "CodeComplete: " << C.Name << (IndexResult ? " (index)" : "")
<< (SemaResult ? " (sema)" : "") << " = " << Scores.finalScore
-   << "\n" 
+   << "\n"
<< Quality << Relevance << "\n");
 
 NSema += bool(SemaResult);
@@ -1035,7 +1036,8 @@
/*CommentsFromHeader=*/false);
   }
 }
-return Candidate.build(FileName, Scores, Opts, SemaCCS, Includes.get(), DocComment);
+return Candidate.build(FileName, Scores, Opts, SemaCCS, Includes.get(),
+   DocComment);
   }
 };
 



[PATCH] D47471: [clangd] Minor cleanup

2018-05-29 Thread Kirill Bobyrev via Phabricator via cfe-commits
omtcyfz added a comment.

Thank you, Eric!


Repository:
  rL LLVM

https://reviews.llvm.org/D47471





[PATCH] D51029: [clangd] Implement LIMIT iterator

2018-08-23 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev updated this revision to Diff 162183.
kbobyrev marked 3 inline comments as done.
kbobyrev added a comment.

Address a round of comments from Sam.


https://reviews.llvm.org/D51029

Files:
  clang-tools-extra/clangd/index/dex/Iterator.cpp
  clang-tools-extra/clangd/index/dex/Iterator.h
  clang-tools-extra/unittests/clangd/DexIndexTests.cpp

Index: clang-tools-extra/unittests/clangd/DexIndexTests.cpp
===
--- clang-tools-extra/unittests/clangd/DexIndexTests.cpp
+++ clang-tools-extra/unittests/clangd/DexIndexTests.cpp
@@ -234,13 +234,13 @@
   Root->advanceTo(1);
   Root->advanceTo(0);
   EXPECT_EQ(Root->peek(), 1U);
-  auto ElementBoost = Root->consume(Root->peek());
+  auto ElementBoost = Root->consume();
   EXPECT_THAT(ElementBoost, 6);
   Root->advance();
   EXPECT_EQ(Root->peek(), 5U);
   Root->advanceTo(5);
   EXPECT_EQ(Root->peek(), 5U);
-  ElementBoost = Root->consume(Root->peek());
+  ElementBoost = Root->consume();
   EXPECT_THAT(ElementBoost, 8);
   Root->advanceTo(9000);
   EXPECT_TRUE(Root->reachedEnd());
@@ -265,24 +265,26 @@
 }
 
 TEST(DexIndexIterators, Limit) {
-  const PostingList L0 = {4, 7, 8, 20, 42, 100};
-  const PostingList L1 = {1, 3, 5, 8, 9};
-  const PostingList L2 = {1, 5, 7, 9};
-  const PostingList L3 = {0, 5};
-  const PostingList L4 = {0, 1, 5};
-  const PostingList L5;
+  const PostingList L0 = {3, 6, 7, 20, 42, 100};
+  const PostingList L1 = {1, 3, 5, 6, 7, 30, 100};
+  const PostingList L2 = {0, 3, 5, 7, 8, 100};
 
   auto DocIterator = create(L0);
-  EXPECT_THAT(consumeIDs(*DocIterator, 42), ElementsAre(4, 7, 8, 20, 42, 100));
+  EXPECT_THAT(consumeIDs(*DocIterator, 42), ElementsAre(3, 6, 7, 20, 42, 100));
 
   DocIterator = create(L0);
-  EXPECT_THAT(consumeIDs(*DocIterator), ElementsAre(4, 7, 8, 20, 42, 100));
+  EXPECT_THAT(consumeIDs(*DocIterator), ElementsAre(3, 6, 7, 20, 42, 100));
 
   DocIterator = create(L0);
-  EXPECT_THAT(consumeIDs(*DocIterator, 3), ElementsAre(4, 7, 8));
+  EXPECT_THAT(consumeIDs(*DocIterator, 3), ElementsAre(3, 6, 7));
 
   DocIterator = create(L0);
   EXPECT_THAT(consumeIDs(*DocIterator, 0), ElementsAre());
+
+  auto AndIterator =
+  createAnd(createLimit(createTrue(9000), 343), createLimit(create(L0), 2),
+createLimit(create(L1), 3), createLimit(create(L2), 42));
+  EXPECT_THAT(consumeIDs(*AndIterator), ElementsAre(3, 7));
 }
 
 TEST(DexIndexIterators, True) {
@@ -301,28 +303,28 @@
 TEST(DexIndexIterators, Boost) {
   auto BoostIterator = createBoost(createTrue(5U), 42U);
   EXPECT_FALSE(BoostIterator->reachedEnd());
-  auto ElementBoost = BoostIterator->consume(BoostIterator->peek());
+  auto ElementBoost = BoostIterator->consume();
   EXPECT_THAT(ElementBoost, 42U);
 
   const PostingList L0 = {2, 4};
   const PostingList L1 = {1, 4};
   auto Root = createOr(createTrue(5U), createBoost(create(L0), 2U),
createBoost(create(L1), 3U));
 
-  ElementBoost = Root->consume(Root->peek());
+  ElementBoost = Root->consume();
   EXPECT_THAT(ElementBoost, Iterator::DEFAULT_BOOST_SCORE);
   Root->advance();
   EXPECT_THAT(Root->peek(), 1U);
-  ElementBoost = Root->consume(Root->peek());
+  ElementBoost = Root->consume();
   EXPECT_THAT(ElementBoost, 3);
 
   Root->advance();
   EXPECT_THAT(Root->peek(), 2U);
-  ElementBoost = Root->consume(Root->peek());
+  ElementBoost = Root->consume();
   EXPECT_THAT(ElementBoost, 2);
 
   Root->advanceTo(4);
-  ElementBoost = Root->consume(Root->peek());
+  ElementBoost = Root->consume();
   EXPECT_THAT(ElementBoost, 3);
 }
 
Index: clang-tools-extra/clangd/index/dex/Iterator.h
===
--- clang-tools-extra/clangd/index/dex/Iterator.h
+++ clang-tools-extra/clangd/index/dex/Iterator.h
@@ -87,13 +87,14 @@
   ///
   /// Note: reachedEnd() must be false.
   virtual DocID peek() const = 0;
-  /// Retrieves boosting score. Query tree root should pass Root->peek() to this
-  /// function, the parameter is needed to propagate through the tree. Given ID
-  /// should be compared against BOOST iterator peek()s: some of the iterators
-  /// would not point to the item which was propagated to the top of the query
-  /// tree (e.g. if these iterators are branches of OR iterator) and hence
-  /// shouldn't apply any boosting to the consumed item.
-  virtual float consume(DocID ID) = 0;
+  /// Informs the iterator that the current document was consumed, and returns
+  /// its boost.
+  ///
+  /// Note: If this iterator has any child iterators that contain the document,
+  /// consume() should be called on those and their boosts incorporated.
+  /// consume() must *not* be called on children that don't contain the current
+  /// doc.
+  virtual float consume() = 0;
 
   virtual ~Iterator() {}
 
@@ -165,6 +166,13 @@
 std::unique_ptr createBoost(std::unique_ptr Child,
   float Factor);
 
+/// Returns LIMIT iterator, which yield
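
The consume() contract described in the new Iterator.h comment (forward the call only to children that contain the current document, and fold their boosts together) can be sketched for an OR node. Node, BoostedList, OrNode, collect and orExample are hypothetical names. Taking the maximum of the matching boosts and using 1 as the default are assumptions of this sketch, chosen to agree with the expectations in the Boost test above, where a document present in lists boosted by 2 and 3 yields 3.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <memory>
#include <utility>
#include <vector>

using Doc = unsigned; // hypothetical stand-in for DocID

struct Node {
  virtual ~Node() = default;
  virtual bool reachedEnd() const = 0;
  virtual void advance() = 0;
  virtual Doc peek() const = 0;
  virtual float consume() = 0; // returns the boost of the current document
};

// Sorted list of documents that all carry the same boost.
struct BoostedList : Node {
  BoostedList(std::vector<Doc> Docs, float Boost)
      : Docs(std::move(Docs)), Boost(Boost) {}
  bool reachedEnd() const override { return Pos == Docs.size(); }
  void advance() override { assert(!reachedEnd()); ++Pos; }
  Doc peek() const override { assert(!reachedEnd()); return Docs[Pos]; }
  float consume() override { assert(!reachedEnd()); return Boost; }
  std::vector<Doc> Docs;
  float Boost;
  std::size_t Pos = 0;
};

// Union of the children; the current document is the smallest child peek().
struct OrNode : Node {
  explicit OrNode(std::vector<std::unique_ptr<Node>> Cs)
      : Children(std::move(Cs)) {}
  bool reachedEnd() const override {
    return std::all_of(
        Children.begin(), Children.end(),
        [](const std::unique_ptr<Node> &C) { return C->reachedEnd(); });
  }
  Doc peek() const override {
    assert(!reachedEnd());
    Doc Min = std::numeric_limits<Doc>::max();
    for (const auto &C : Children)
      if (!C->reachedEnd())
        Min = std::min(Min, C->peek());
    return Min;
  }
  void advance() override {
    const Doc Cur = peek();
    for (auto &C : Children)
      if (!C->reachedEnd() && C->peek() == Cur)
        C->advance();
  }
  float consume() override {
    const Doc Cur = peek();
    float Boost = 1.0f; // assumed default boost
    // Children that do not point at Cur must not be consumed.
    for (auto &C : Children)
      if (!C->reachedEnd() && C->peek() == Cur)
        Boost = std::max(Boost, C->consume());
    return Boost;
  }
  std::vector<std::unique_ptr<Node>> Children;
};

// Root-level loop: consume the current document, then move on.
std::vector<std::pair<Doc, float>> collect(Node &Root) {
  std::vector<std::pair<Doc, float>> Out;
  for (; !Root.reachedEnd(); Root.advance()) {
    const Doc ID = Root.peek();
    Out.emplace_back(ID, Root.consume());
  }
  return Out;
}

// OR of two boosted lists that share document 4.
std::vector<std::pair<Doc, float>> orExample() {
  std::vector<std::unique_ptr<Node>> Children;
  Children.push_back(std::make_unique<BoostedList>(std::vector<Doc>{2, 4}, 2.0f));
  Children.push_back(std::make_unique<BoostedList>(std::vector<Doc>{1, 4}, 3.0f));
  OrNode Root(std::move(Children));
  return collect(Root);
}
```

Compared with the earlier consume(DocID) signature, the root no longer passes an ID down the tree. Each node decides from its own peek() which children to notify.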

[PATCH] D51154: [clangd] Log memory usage of DexIndex and MemIndex

2018-08-23 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev updated this revision to Diff 162184.
kbobyrev marked 2 inline comments as done.
kbobyrev added a comment.

Slightly simplify the code.


https://reviews.llvm.org/D51154

Files:
  clang-tools-extra/clangd/index/FileIndex.cpp
  clang-tools-extra/clangd/index/FileIndex.h
  clang-tools-extra/clangd/index/Index.h
  clang-tools-extra/clangd/index/MemIndex.cpp
  clang-tools-extra/clangd/index/MemIndex.h
  clang-tools-extra/clangd/index/Merge.cpp
  clang-tools-extra/clangd/index/dex/DexIndex.cpp
  clang-tools-extra/clangd/index/dex/DexIndex.h
  clang-tools-extra/unittests/clangd/CodeCompleteTests.cpp

Index: clang-tools-extra/unittests/clangd/CodeCompleteTests.cpp
===
--- clang-tools-extra/unittests/clangd/CodeCompleteTests.cpp
+++ clang-tools-extra/unittests/clangd/CodeCompleteTests.cpp
@@ -923,6 +923,10 @@
llvm::function_ref
Callback) const override {}
 
+  // This is incorrect, but IndexRequestCollector is not an actual index and it
+  // isn't used in production code.
+  size_t estimateMemoryUsage() const override { return 0; }
+
   const std::vector allRequests() const { return Requests; }
 
 private:
Index: clang-tools-extra/clangd/index/dex/DexIndex.h
===
--- clang-tools-extra/clangd/index/dex/DexIndex.h
+++ clang-tools-extra/clangd/index/dex/DexIndex.h
@@ -57,7 +57,10 @@
llvm::function_ref
Callback) const override;
 
+  size_t estimateMemoryUsage() const override;
+
 private:
+
   mutable std::mutex Mutex;
 
   std::shared_ptr> Symbols /*GUARDED_BY(Mutex)*/;
Index: clang-tools-extra/clangd/index/dex/DexIndex.cpp
===
--- clang-tools-extra/clangd/index/dex/DexIndex.cpp
+++ clang-tools-extra/clangd/index/dex/DexIndex.cpp
@@ -67,6 +67,9 @@
 InvertedIndex = std::move(TempInvertedIndex);
 SymbolQuality = std::move(TempSymbolQuality);
   }
+
+  vlog("Built DexIndex with estimated memory usage {0} bytes.",
+   estimateMemoryUsage());
 }
 
 std::unique_ptr DexIndex::build(SymbolSlab Slab) {
@@ -171,6 +174,20 @@
   log("findOccurrences is not implemented.");
 }
 
+size_t DexIndex::estimateMemoryUsage() const {
+  std::lock_guard Lock(Mutex);
+
+  size_t Bytes =
+  LookupTable.size() * sizeof(std::pair);
+  Bytes += SymbolQuality.size() * sizeof(std::pair);
+  Bytes += InvertedIndex.size() * sizeof(Token);
+
+  for (const auto &P : InvertedIndex) {
+Bytes += P.second.size() * sizeof(DocID);
+  }
+  return Bytes;
+}
+
 } // namespace dex
 } // namespace clangd
 } // namespace clang
Index: clang-tools-extra/clangd/index/Merge.cpp
===
--- clang-tools-extra/clangd/index/Merge.cpp
+++ clang-tools-extra/clangd/index/Merge.cpp
@@ -84,6 +84,10 @@
 log("findOccurrences is not implemented.");
   }
 
+  size_t estimateMemoryUsage() const override {
+return Dynamic->estimateMemoryUsage() + Static->estimateMemoryUsage();
+  }
+
 private:
   const SymbolIndex *Dynamic, *Static;
 };
Index: clang-tools-extra/clangd/index/MemIndex.h
===
--- clang-tools-extra/clangd/index/MemIndex.h
+++ clang-tools-extra/clangd/index/MemIndex.h
@@ -39,7 +39,10 @@
llvm::function_ref
Callback) const override;
 
+  size_t estimateMemoryUsage() const override;
+
 private:
+
   std::shared_ptr> Symbols;
   // Index is a set of symbols that are deduplicated by symbol IDs.
   // FIXME: build smarter index structure.
Index: clang-tools-extra/clangd/index/MemIndex.cpp
===
--- clang-tools-extra/clangd/index/MemIndex.cpp
+++ clang-tools-extra/clangd/index/MemIndex.cpp
@@ -26,6 +26,9 @@
 Index = std::move(TempIndex);
 Symbols = std::move(Syms); // Relase old symbols.
   }
+
+  vlog("Built MemIndex with estimated memory usage {0} bytes.",
+   estimateMemoryUsage());
 }
 
 std::unique_ptr MemIndex::build(SymbolSlab Slab) {
@@ -98,5 +101,10 @@
   &Snap->Pointers);
 }
 
+size_t MemIndex::estimateMemoryUsage() const {
+  std::lock_guard Lock(Mutex);
+  return Index.getMemorySize();
+}
+
 } // namespace clangd
 } // namespace clang
Index: clang-tools-extra/clangd/index/Index.h
===
--- clang-tools-extra/clangd/index/Index.h
+++ clang-tools-extra/clangd/index/Index.h
@@ -385,6 +385,12 @@
   virtual void findOccurrences(
   const OccurrencesRequest &Req,
   llvm::function_ref Callback) const = 0;
+
+  /// Returns estimated size of index (in bytes).
+  // FIXME(kbobyrev): Currently, this only returns the size of index itself
+  // excluding the size of actual symbol slab i

[PATCH] D51154: [clangd] Log memory usage of DexIndex and MemIndex

2018-08-24 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev updated this revision to Diff 162334.
kbobyrev marked 3 inline comments as done.
kbobyrev added a comment.

Address a few concerns.


https://reviews.llvm.org/D51154

Files:
  clang-tools-extra/clangd/index/FileIndex.cpp
  clang-tools-extra/clangd/index/FileIndex.h
  clang-tools-extra/clangd/index/Index.h
  clang-tools-extra/clangd/index/MemIndex.cpp
  clang-tools-extra/clangd/index/MemIndex.h
  clang-tools-extra/clangd/index/Merge.cpp
  clang-tools-extra/clangd/index/dex/DexIndex.cpp
  clang-tools-extra/clangd/index/dex/DexIndex.h
  clang-tools-extra/unittests/clangd/CodeCompleteTests.cpp

Index: clang-tools-extra/unittests/clangd/CodeCompleteTests.cpp
===
--- clang-tools-extra/unittests/clangd/CodeCompleteTests.cpp
+++ clang-tools-extra/unittests/clangd/CodeCompleteTests.cpp
@@ -923,6 +923,10 @@
llvm::function_ref
Callback) const override {}
 
+  // This is incorrect, but IndexRequestCollector is not an actual index and it
+  // isn't used in production code.
+  size_t estimateMemoryUsage() const override { return 0; }
+
   const std::vector allRequests() const { return Requests; }
 
 private:
Index: clang-tools-extra/clangd/index/dex/DexIndex.h
===
--- clang-tools-extra/clangd/index/dex/DexIndex.h
+++ clang-tools-extra/clangd/index/dex/DexIndex.h
@@ -57,7 +57,10 @@
llvm::function_ref
Callback) const override;
 
+  size_t estimateMemoryUsage() const override;
+
 private:
+
   mutable std::mutex Mutex;
 
   std::shared_ptr> Symbols /*GUARDED_BY(Mutex)*/;
Index: clang-tools-extra/clangd/index/dex/DexIndex.cpp
===
--- clang-tools-extra/clangd/index/dex/DexIndex.cpp
+++ clang-tools-extra/clangd/index/dex/DexIndex.cpp
@@ -67,6 +67,9 @@
 InvertedIndex = std::move(TempInvertedIndex);
 SymbolQuality = std::move(TempSymbolQuality);
   }
+
+  vlog("Built DexIndex with estimated memory usage {0} bytes.",
+   estimateMemoryUsage());
 }
 
 std::unique_ptr DexIndex::build(SymbolSlab Slab) {
@@ -171,6 +174,20 @@
   log("findOccurrences is not implemented.");
 }
 
+size_t DexIndex::estimateMemoryUsage() const {
+  std::lock_guard<std::mutex> Lock(Mutex);
+
+  size_t Bytes =
+  LookupTable.size() * sizeof(std::pair);
+  Bytes += SymbolQuality.size() * sizeof(std::pair);
+  Bytes += InvertedIndex.size() * sizeof(Token);
+
+  for (const auto &P : InvertedIndex) {
+Bytes += P.second.size() * sizeof(DocID);
+  }
+  return Bytes;
+}
+
 } // namespace dex
 } // namespace clangd
 } // namespace clang
Index: clang-tools-extra/clangd/index/Merge.cpp
===
--- clang-tools-extra/clangd/index/Merge.cpp
+++ clang-tools-extra/clangd/index/Merge.cpp
@@ -84,6 +84,10 @@
 log("findOccurrences is not implemented.");
   }
 
+  size_t estimateMemoryUsage() const override {
+return Dynamic->estimateMemoryUsage() + Static->estimateMemoryUsage();
+  }
+
 private:
   const SymbolIndex *Dynamic, *Static;
 };
Index: clang-tools-extra/clangd/index/MemIndex.h
===
--- clang-tools-extra/clangd/index/MemIndex.h
+++ clang-tools-extra/clangd/index/MemIndex.h
@@ -39,7 +39,10 @@
llvm::function_ref
Callback) const override;
 
+  size_t estimateMemoryUsage() const override;
+
 private:
+
   std::shared_ptr> Symbols;
   // Index is a set of symbols that are deduplicated by symbol IDs.
   // FIXME: build smarter index structure.
Index: clang-tools-extra/clangd/index/MemIndex.cpp
===
--- clang-tools-extra/clangd/index/MemIndex.cpp
+++ clang-tools-extra/clangd/index/MemIndex.cpp
@@ -26,6 +26,9 @@
 Index = std::move(TempIndex);
 Symbols = std::move(Syms); // Relase old symbols.
   }
+
+  vlog("Built MemIndex with estimated memory usage {0} bytes.",
+   estimateMemoryUsage());
 }
 
 std::unique_ptr MemIndex::build(SymbolSlab Slab) {
@@ -98,5 +101,10 @@
   &Snap->Pointers);
 }
 
+size_t MemIndex::estimateMemoryUsage() const {
+  std::lock_guard<std::mutex> Lock(Mutex);
+  return Index.getMemorySize();
+}
+
 } // namespace clangd
 } // namespace clang
Index: clang-tools-extra/clangd/index/Index.h
===
--- clang-tools-extra/clangd/index/Index.h
+++ clang-tools-extra/clangd/index/Index.h
@@ -385,6 +385,12 @@
   virtual void findOccurrences(
   const OccurrencesRequest &Req,
   llvm::function_ref Callback) const = 0;
+
+  /// Returns estimated size of index (in bytes).
+  // FIXME(kbobyrev): Currently, this only returns the size of index itself
+  // excluding the size of actual symbol slab index r

[PATCH] D51154: [clangd] Log memory usage of DexIndex and MemIndex

2018-08-24 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev added a comment.

In https://reviews.llvm.org/D51154#1211376, @ioeric wrote:

> Do we plan to expose an API in `ClangdServer` to allow C++ API users to track 
> index memory usages?


I think we do. IIUC, the conclusion of the offline discussion was that it might 
be useful for clients. However, that would also require estimating the size of 
the `SymbolSlab` (which is probably out of scope for this patch).
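
For reference, here is a minimal, self-contained sketch of the kind of estimate 
the patch computes: element count times element size for each container, plus 
the posting list contents. `SketchIndex` and its key/value types are 
hypothetical stand-ins chosen for illustration, not the actual clangd types, 
and (like the patch) the sketch ignores allocator and hash-table overhead as 
well as the symbol slab itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only; the real clangd types differ.
using DocID = std::uint32_t;
using Token = std::string;
using PostingList = std::vector<DocID>;

struct SketchIndex {
  std::unordered_map<std::uint64_t, DocID> LookupTable; // symbol ID -> DocID
  std::unordered_map<DocID, float> SymbolQuality;       // DocID -> quality
  std::unordered_map<Token, PostingList> InvertedIndex; // token -> documents

  // Rough lower bound: counts stored elements only, not bucket arrays,
  // heap-allocated string contents, or spare vector capacity.
  std::size_t estimateMemoryUsage() const {
    std::size_t Bytes =
        LookupTable.size() * sizeof(std::pair<std::uint64_t, DocID>);
    Bytes += SymbolQuality.size() * sizeof(std::pair<DocID, float>);
    Bytes += InvertedIndex.size() * sizeof(Token);
    for (const auto &P : InvertedIndex)
      Bytes += P.second.size() * sizeof(DocID);
    return Bytes;
  }
};
```

An empty `SketchIndex` reports 0 bytes, which shows why the number should be 
read as an estimate of the index structures only: an API exposed to clients 
would need to add the slab size on top.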


https://reviews.llvm.org/D51154





[PATCH] D51029: [clangd] Implement LIMIT iterator

2018-08-24 Thread Kirill Bobyrev via Phabricator via cfe-commits
kbobyrev updated this revision to Diff 162341.
kbobyrev marked 7 inline comments as done.
kbobyrev added a comment.

Address a round of comments & simplify code.
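
To illustrate the idea, below is a minimal sketch of a LIMIT iterator over a 
simplified interface. Everything in it (`SketchIterator`, `ListIterator`, 
`LimitIterator`, `consumeAll`) is a hypothetical reduction written for this 
note, not the code in Iterator.h. It assumes the limit is counted on 
`consume()` rather than on `advance()`, which is one reading consistent with 
the test expectations in this diff.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

using DocID = std::uint32_t;

// Simplified, hypothetical interface; the real Iterator has more methods.
class SketchIterator {
public:
  virtual ~SketchIterator() = default;
  virtual bool reachedEnd() const = 0;
  virtual void advance() = 0;
  virtual DocID peek() const = 0;
  // Called when the item under the cursor is taken into the result.
  virtual void consume() = 0;
};

// Walks a sorted list of document IDs.
class ListIterator : public SketchIterator {
public:
  explicit ListIterator(std::vector<DocID> Docs) : Docs(std::move(Docs)) {}
  bool reachedEnd() const override { return Pos == Docs.size(); }
  void advance() override { assert(!reachedEnd()); ++Pos; }
  DocID peek() const override { assert(!reachedEnd()); return Docs[Pos]; }
  void consume() override {}

private:
  std::vector<DocID> Docs;
  std::size_t Pos = 0;
};

// Reports the end once Limit items have been consumed from the child.
class LimitIterator : public SketchIterator {
public:
  LimitIterator(std::unique_ptr<SketchIterator> Child, std::size_t Limit)
      : Child(std::move(Child)), ItemsLeft(Limit) {}
  bool reachedEnd() const override {
    return ItemsLeft == 0 || Child->reachedEnd();
  }
  void advance() override { Child->advance(); }
  DocID peek() const override { return Child->peek(); }
  void consume() override {
    assert(!reachedEnd());
    --ItemsLeft; // Only consumed items count against the limit.
    Child->consume();
  }

private:
  std::unique_ptr<SketchIterator> Child;
  std::size_t ItemsLeft;
};

// Drains the iterator, consuming every item it points to.
std::vector<DocID> consumeAll(SketchIterator &It) {
  std::vector<DocID> Result;
  for (; !It.reachedEnd(); It.advance()) {
    Result.push_back(It.peek());
    It.consume();
  }
  return Result;
}
```

Counting on `consume()` means that items a parent iterator skips over do not 
use up the child's budget, so a limit placed under an AND only applies to 
documents that actually reach the result.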


https://reviews.llvm.org/D51029

Files:
  clang-tools-extra/clangd/index/dex/DexIndex.cpp
  clang-tools-extra/clangd/index/dex/Iterator.cpp
  clang-tools-extra/clangd/index/dex/Iterator.h
  clang-tools-extra/unittests/clangd/DexIndexTests.cpp

Index: clang-tools-extra/unittests/clangd/DexIndexTests.cpp
===
--- clang-tools-extra/unittests/clangd/DexIndexTests.cpp
+++ clang-tools-extra/unittests/clangd/DexIndexTests.cpp
@@ -30,8 +30,10 @@
 namespace {
 
 std::vector<DocID>
-consumeIDs(Iterator &It, size_t Limit = std::numeric_limits<size_t>::max()) {
-  auto IDAndScore = consume(It, Limit);
+consumeIDs(std::unique_ptr<Iterator> It,
+           size_t Limit = std::numeric_limits<size_t>::max()) {
+  auto Root = createLimit(move(It), Limit);
+  auto IDAndScore = consume(*Root);
   std::vector<DocID> IDs(IDAndScore.size());
   for (size_t I = 0; I < IDAndScore.size(); ++I)
     IDs[I] = IDAndScore[I].first;
@@ -71,7 +73,7 @@
   auto AndWithEmpty = createAnd(create(L0), create(L1));
   EXPECT_TRUE(AndWithEmpty->reachedEnd());
 
-  EXPECT_THAT(consumeIDs(*AndWithEmpty), ElementsAre());
+  EXPECT_THAT(consumeIDs(move(AndWithEmpty)), ElementsAre());
 }
 
 TEST(DexIndexIterators, AndTwoLists) {
@@ -81,7 +83,7 @@
   auto And = createAnd(create(L1), create(L0));
 
   EXPECT_FALSE(And->reachedEnd());
-  EXPECT_THAT(consumeIDs(*And), ElementsAre(0U, 7U, 10U, 320U, 9000U));
+  EXPECT_THAT(consumeIDs(move(And)), ElementsAre(0U, 7U, 10U, 320U, 9000U));
 
   And = createAnd(create(L0), create(L1));
 
@@ -122,7 +124,7 @@
   auto OrWithEmpty = createOr(create(L0), create(L1));
   EXPECT_FALSE(OrWithEmpty->reachedEnd());
 
-  EXPECT_THAT(consumeIDs(*OrWithEmpty),
+  EXPECT_THAT(consumeIDs(move(OrWithEmpty)),
   ElementsAre(0U, 5U, 7U, 10U, 42U, 320U, 9000U));
 }
 
@@ -155,7 +157,7 @@
 
   Or = createOr(create(L0), create(L1));
 
-  EXPECT_THAT(consumeIDs(*Or),
+  EXPECT_THAT(consumeIDs(move(Or)),
   ElementsAre(0U, 4U, 5U, 7U, 10U, 30U, 42U, 60U, 320U, 9000U));
 }
 
@@ -234,13 +236,13 @@
   Root->advanceTo(1);
   Root->advanceTo(0);
   EXPECT_EQ(Root->peek(), 1U);
-  auto ElementBoost = Root->consume(Root->peek());
+  auto ElementBoost = Root->consume();
   EXPECT_THAT(ElementBoost, 6);
   Root->advance();
   EXPECT_EQ(Root->peek(), 5U);
   Root->advanceTo(5);
   EXPECT_EQ(Root->peek(), 5U);
-  ElementBoost = Root->consume(Root->peek());
+  ElementBoost = Root->consume();
   EXPECT_THAT(ElementBoost, 8);
   Root->advanceTo(9000);
   EXPECT_TRUE(Root->reachedEnd());
@@ -265,64 +267,67 @@
 }
 
 TEST(DexIndexIterators, Limit) {
-  const PostingList L0 = {4, 7, 8, 20, 42, 100};
-  const PostingList L1 = {1, 3, 5, 8, 9};
-  const PostingList L2 = {1, 5, 7, 9};
-  const PostingList L3 = {0, 5};
-  const PostingList L4 = {0, 1, 5};
-  const PostingList L5;
+  const PostingList L0 = {3, 6, 7, 20, 42, 100};
+  const PostingList L1 = {1, 3, 5, 6, 7, 30, 100};
+  const PostingList L2 = {0, 3, 5, 7, 8, 100};
 
   auto DocIterator = create(L0);
-  EXPECT_THAT(consumeIDs(*DocIterator, 42), ElementsAre(4, 7, 8, 20, 42, 100));
+  EXPECT_THAT(consumeIDs(move(DocIterator), 42),
+  ElementsAre(3, 6, 7, 20, 42, 100));
 
   DocIterator = create(L0);
-  EXPECT_THAT(consumeIDs(*DocIterator), ElementsAre(4, 7, 8, 20, 42, 100));
+  EXPECT_THAT(consumeIDs(move(DocIterator)), ElementsAre(3, 6, 7, 20, 42, 100));
 
   DocIterator = create(L0);
-  EXPECT_THAT(consumeIDs(*DocIterator, 3), ElementsAre(4, 7, 8));
+  EXPECT_THAT(consumeIDs(move(DocIterator), 3), ElementsAre(3, 6, 7));
 
   DocIterator = create(L0);
-  EXPECT_THAT(consumeIDs(*DocIterator, 0), ElementsAre());
+  EXPECT_THAT(consumeIDs(move(DocIterator), 0), ElementsAre());
+
+  auto AndIterator =
+  createAnd(createLimit(createTrue(9000), 343), createLimit(create(L0), 2),
+createLimit(create(L1), 3), createLimit(create(L2), 42));
+  EXPECT_THAT(consumeIDs(move(AndIterator)), ElementsAre(3, 7));
 }
 
 TEST(DexIndexIterators, True) {
   auto TrueIterator = createTrue(0U);
   EXPECT_TRUE(TrueIterator->reachedEnd());
-  EXPECT_THAT(consumeIDs(*TrueIterator), ElementsAre());
+  EXPECT_THAT(consumeIDs(move(TrueIterator)), ElementsAre());
 
   PostingList L0 = {1, 2, 5, 7};
   TrueIterator = createTrue(7U);
   EXPECT_THAT(TrueIterator->peek(), 0);
   auto AndIterator = createAnd(create(L0), move(TrueIterator));
   EXPECT_FALSE(AndIterator->reachedEnd());
-  EXPECT_THAT(consumeIDs(*AndIterator), ElementsAre(1, 2, 5));
+  EXPECT_THAT(consumeIDs(move(AndIterator)), ElementsAre(1, 2, 5));
 }
 
 TEST(DexIndexIterators, Boost) {
   auto BoostIterator = createBoost(createTrue(5U), 42U);
   EXPECT_FALSE(BoostIterator->reachedEnd());
-  auto ElementBoost = BoostIterator->consume(BoostIterator->peek());
+  auto ElementBoost = BoostIterator->consume();
   EXPECT_THAT(Elemen

[PATCH] D51154: [clangd] Log memory usage of DexIndex and MemIndex

2018-08-24 Thread Kirill Bobyrev via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL340601: [clangd] Log memory usage of DexIndex and MemIndex 
(authored by omtcyfz, committed by ).
Herald added a subscriber: llvm-commits.

Changed prior to commit:
  https://reviews.llvm.org/D51154?vs=162334&id=162347#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D51154

Files:
  clang-tools-extra/trunk/clangd/index/FileIndex.cpp
  clang-tools-extra/trunk/clangd/index/FileIndex.h
  clang-tools-extra/trunk/clangd/index/Index.h
  clang-tools-extra/trunk/clangd/index/MemIndex.cpp
  clang-tools-extra/trunk/clangd/index/MemIndex.h
  clang-tools-extra/trunk/clangd/index/Merge.cpp
  clang-tools-extra/trunk/clangd/index/dex/DexIndex.cpp
  clang-tools-extra/trunk/clangd/index/dex/DexIndex.h
  clang-tools-extra/trunk/unittests/clangd/CodeCompleteTests.cpp

Index: clang-tools-extra/trunk/clangd/index/FileIndex.cpp
===
--- clang-tools-extra/trunk/clangd/index/FileIndex.cpp
+++ clang-tools-extra/trunk/clangd/index/FileIndex.cpp
@@ -118,5 +118,9 @@
   log("findOccurrences is not implemented.");
 }
 
+size_t FileIndex::estimateMemoryUsage() const {
+  return Index.estimateMemoryUsage();
+}
+
 } // namespace clangd
 } // namespace clang
Index: clang-tools-extra/trunk/clangd/index/MemIndex.cpp
===
--- clang-tools-extra/trunk/clangd/index/MemIndex.cpp
+++ clang-tools-extra/trunk/clangd/index/MemIndex.cpp
@@ -26,6 +26,9 @@
 Index = std::move(TempIndex);
 Symbols = std::move(Syms); // Relase old symbols.
   }
+
+  vlog("Built MemIndex with estimated memory usage {0} bytes.",
+   estimateMemoryUsage());
 }
 
 std::unique_ptr MemIndex::build(SymbolSlab Slab) {
@@ -98,5 +101,10 @@
   &Snap->Pointers);
 }
 
+size_t MemIndex::estimateMemoryUsage() const {
+  std::lock_guard<std::mutex> Lock(Mutex);
+  return Index.getMemorySize();
+}
+
 } // namespace clangd
 } // namespace clang
Index: clang-tools-extra/trunk/clangd/index/dex/DexIndex.cpp
===
--- clang-tools-extra/trunk/clangd/index/dex/DexIndex.cpp
+++ clang-tools-extra/trunk/clangd/index/dex/DexIndex.cpp
@@ -67,6 +67,9 @@
 InvertedIndex = std::move(TempInvertedIndex);
 SymbolQuality = std::move(TempSymbolQuality);
   }
+
+  vlog("Built DexIndex with estimated memory usage {0} bytes.",
+   estimateMemoryUsage());
 }
 
 std::unique_ptr DexIndex::build(SymbolSlab Slab) {
@@ -171,6 +174,20 @@
   log("findOccurrences is not implemented.");
 }
 
+size_t DexIndex::estimateMemoryUsage() const {
+  std::lock_guard<std::mutex> Lock(Mutex);
+
+  size_t Bytes =
+  LookupTable.size() * sizeof(std::pair);
+  Bytes += SymbolQuality.size() * sizeof(std::pair);
+  Bytes += InvertedIndex.size() * sizeof(Token);
+
+  for (const auto &P : InvertedIndex) {
+Bytes += P.second.size() * sizeof(DocID);
+  }
+  return Bytes;
+}
+
 } // namespace dex
 } // namespace clangd
 } // namespace clang
Index: clang-tools-extra/trunk/clangd/index/dex/DexIndex.h
===
--- clang-tools-extra/trunk/clangd/index/dex/DexIndex.h
+++ clang-tools-extra/trunk/clangd/index/dex/DexIndex.h
@@ -57,7 +57,10 @@
llvm::function_ref
Callback) const override;
 
+  size_t estimateMemoryUsage() const override;
+
 private:
+
   mutable std::mutex Mutex;
 
   std::shared_ptr> Symbols /*GUARDED_BY(Mutex)*/;
Index: clang-tools-extra/trunk/clangd/index/Merge.cpp
===
--- clang-tools-extra/trunk/clangd/index/Merge.cpp
+++ clang-tools-extra/trunk/clangd/index/Merge.cpp
@@ -84,6 +84,10 @@
 log("findOccurrences is not implemented.");
   }
 
+  size_t estimateMemoryUsage() const override {
+return Dynamic->estimateMemoryUsage() + Static->estimateMemoryUsage();
+  }
+
 private:
   const SymbolIndex *Dynamic, *Static;
 };
Index: clang-tools-extra/trunk/clangd/index/Index.h
===
--- clang-tools-extra/trunk/clangd/index/Index.h
+++ clang-tools-extra/trunk/clangd/index/Index.h
@@ -385,6 +385,12 @@
   virtual void findOccurrences(
   const OccurrencesRequest &Req,
   llvm::function_ref Callback) const = 0;
+
+  /// Returns estimated size of index (in bytes).
+  // FIXME(kbobyrev): Currently, this only returns the size of index itself
+  // excluding the size of actual symbol slab index refers to. We should include
+  // both.
+  virtual size_t estimateMemoryUsage() const = 0;
 };
 
 } // namespace clangd
Index: clang-tools-extra/trunk/clangd/index/FileIndex.h
===
--- clang-tools-extra/trunk/clangd/index/FileIndex.h
+++ clang-tools-extra/trunk/clangd/index/FileIndex.h
@@ -81,6 +81,9 @@
   vo

[PATCH] D51029: [clangd] Implement LIMIT iterator

2018-08-24 Thread Kirill Bobyrev via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL340605: [clangd] Implement LIMIT iterator (authored by 
omtcyfz, committed by ).
Herald added a subscriber: llvm-commits.

Changed prior to commit:
  https://reviews.llvm.org/D51029?vs=162341&id=162359#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D51029

Files:
  clang-tools-extra/trunk/clangd/index/dex/DexIndex.cpp
  clang-tools-extra/trunk/clangd/index/dex/Iterator.cpp
  clang-tools-extra/trunk/clangd/index/dex/Iterator.h
  clang-tools-extra/trunk/unittests/clangd/DexIndexTests.cpp

Index: clang-tools-extra/trunk/unittests/clangd/DexIndexTests.cpp
===
--- clang-tools-extra/trunk/unittests/clangd/DexIndexTests.cpp
+++ clang-tools-extra/trunk/unittests/clangd/DexIndexTests.cpp
@@ -29,9 +29,8 @@
 namespace dex {
 namespace {
 
-std::vector<DocID>
-consumeIDs(Iterator &It, size_t Limit = std::numeric_limits<size_t>::max()) {
-  auto IDAndScore = consume(It, Limit);
+std::vector<DocID> consumeIDs(Iterator &It) {
+  auto IDAndScore = consume(It);
   std::vector<DocID> IDs(IDAndScore.size());
   for (size_t I = 0; I < IDAndScore.size(); ++I)
     IDs[I] = IDAndScore[I].first;
@@ -234,13 +233,13 @@
   Root->advanceTo(1);
   Root->advanceTo(0);
   EXPECT_EQ(Root->peek(), 1U);
-  auto ElementBoost = Root->consume(Root->peek());
+  auto ElementBoost = Root->consume();
   EXPECT_THAT(ElementBoost, 6);
   Root->advance();
   EXPECT_EQ(Root->peek(), 5U);
   Root->advanceTo(5);
   EXPECT_EQ(Root->peek(), 5U);
-  ElementBoost = Root->consume(Root->peek());
+  ElementBoost = Root->consume();
   EXPECT_THAT(ElementBoost, 8);
   Root->advanceTo(9000);
   EXPECT_TRUE(Root->reachedEnd());
@@ -265,24 +264,23 @@
 }
 
 TEST(DexIndexIterators, Limit) {
-  const PostingList L0 = {4, 7, 8, 20, 42, 100};
-  const PostingList L1 = {1, 3, 5, 8, 9};
-  const PostingList L2 = {1, 5, 7, 9};
-  const PostingList L3 = {0, 5};
-  const PostingList L4 = {0, 1, 5};
-  const PostingList L5;
-
-  auto DocIterator = create(L0);
-  EXPECT_THAT(consumeIDs(*DocIterator, 42), ElementsAre(4, 7, 8, 20, 42, 100));
-
-  DocIterator = create(L0);
-  EXPECT_THAT(consumeIDs(*DocIterator), ElementsAre(4, 7, 8, 20, 42, 100));
-
-  DocIterator = create(L0);
-  EXPECT_THAT(consumeIDs(*DocIterator, 3), ElementsAre(4, 7, 8));
-
-  DocIterator = create(L0);
-  EXPECT_THAT(consumeIDs(*DocIterator, 0), ElementsAre());
+  const PostingList L0 = {3, 6, 7, 20, 42, 100};
+  const PostingList L1 = {1, 3, 5, 6, 7, 30, 100};
+  const PostingList L2 = {0, 3, 5, 7, 8, 100};
+
+  auto DocIterator = createLimit(create(L0), 42);
+  EXPECT_THAT(consumeIDs(*DocIterator), ElementsAre(3, 6, 7, 20, 42, 100));
+
+  DocIterator = createLimit(create(L0), 3);
+  EXPECT_THAT(consumeIDs(*DocIterator), ElementsAre(3, 6, 7));
+
+  DocIterator = createLimit(create(L0), 0);
+  EXPECT_THAT(consumeIDs(*DocIterator), ElementsAre());
+
+  auto AndIterator =
+  createAnd(createLimit(createTrue(9000), 343), createLimit(create(L0), 2),
+createLimit(create(L1), 3), createLimit(create(L2), 42));
+  EXPECT_THAT(consumeIDs(*AndIterator), ElementsAre(3, 7));
 }
 
 TEST(DexIndexIterators, True) {
@@ -301,28 +299,28 @@
 TEST(DexIndexIterators, Boost) {
   auto BoostIterator = createBoost(createTrue(5U), 42U);
   EXPECT_FALSE(BoostIterator->reachedEnd());
-  auto ElementBoost = BoostIterator->consume(BoostIterator->peek());
+  auto ElementBoost = BoostIterator->consume();
   EXPECT_THAT(ElementBoost, 42U);
 
   const PostingList L0 = {2, 4};
   const PostingList L1 = {1, 4};
   auto Root = createOr(createTrue(5U), createBoost(create(L0), 2U),
createBoost(create(L1), 3U));
 
-  ElementBoost = Root->consume(Root->peek());
+  ElementBoost = Root->consume();
   EXPECT_THAT(ElementBoost, Iterator::DEFAULT_BOOST_SCORE);
   Root->advance();
   EXPECT_THAT(Root->peek(), 1U);
-  ElementBoost = Root->consume(Root->peek());
+  ElementBoost = Root->consume();
   EXPECT_THAT(ElementBoost, 3);
 
   Root->advance();
   EXPECT_THAT(Root->peek(), 2U);
-  ElementBoost = Root->consume(Root->peek());
+  ElementBoost = Root->consume();
   EXPECT_THAT(ElementBoost, 2);
 
   Root->advanceTo(4);
-  ElementBoost = Root->consume(Root->peek());
+  ElementBoost = Root->consume();
   EXPECT_THAT(ElementBoost, 3);
 }
 
Index: clang-tools-extra/trunk/clangd/index/dex/Iterator.h
===
--- clang-tools-extra/trunk/clangd/index/dex/Iterator.h
+++ clang-tools-extra/trunk/clangd/index/dex/Iterator.h
@@ -87,13 +87,14 @@
   ///
   /// Note: reachedEnd() must be false.
   virtual DocID peek() const = 0;
-  /// Retrieves boosting score. Query tree root should pass Root->peek() to this
-  /// function, the parameter is needed to propagate through the tree. Given ID
-  /// should be compared against BOOST iterator peek()s: some of the iterators
-  /// would not point to the item which was propagated to the top
