ioeric added a comment.

There are a few changes/refactorings which I would suggest splitting from this 
patch, as they would require more thoughts and seem unrelated in this patch. 
Please see the inline comments :)



================
Comment at: clang-tools-extra/clangd/Quality.h:130
+/// directories URIs.
+float parentProximityScore(unsigned Distance);
+
----------------
This should take in a `FileDistanceOptions` to allow customized up/down costs. 
To support general cases, this should probably take `UpDistance`, 
`DownDistance`, and `FileDistanceOptions`.


================
Comment at: clang-tools-extra/clangd/Quality.h:130
+/// directories URIs.
+float parentProximityScore(unsigned Distance);
+
----------------
ioeric wrote:
> This should take in a `FileDistanceOptions` to allow customized up/down 
> costs. To support general cases, this should probably take `UpDistance`, 
> `DownDistance`, and `FileDistanceOptions`.
Not sure if this is the right name. Two paths don't need to be parent and child 
(if we also consider down cost).  Maybe just `proximityScore`?


================
Comment at: clang-tools-extra/clangd/index/dex/DexIndex.cpp:42
 
+struct SymbolAndScore {
+  const Symbol *Sym;
----------------
Maybe `ScoredSymbol`? Is this used outside of `buildIndex`? If not, maybe just 
inline it? Or simply use a `std::pair`?


================
Comment at: clang-tools-extra/clangd/index/dex/DexIndex.cpp:59
     LookupTable[Sym->ID] = Sym;
-    SymbolQuality[Sym] = quality(*Sym);
+    SymbolAndScores[I] = {Sym, static_cast<float>(quality(*Sym))};
   }
----------------
We should fix `quality` to return `float` for consistency. 


================
Comment at: clang-tools-extra/clangd/index/dex/DexIndex.cpp:59
     LookupTable[Sym->ID] = Sym;
-    SymbolQuality[Sym] = quality(*Sym);
+    SymbolAndScores[I] = {Sym, static_cast<float>(quality(*Sym))};
   }
----------------
ioeric wrote:
> We should fix `quality` to return `float` for consistency. 
nit: Instead of reconstruct the structure, I'd probably set the fields manually.


================
Comment at: clang-tools-extra/clangd/index/dex/DexIndex.cpp:65
+  std::sort(begin(SymbolAndScores), end(SymbolAndScores),
+            std::greater<SymbolAndScore>());
+
----------------
Is the `operator>` overload used anywhere besides here? If not, I'd suggest 
inlining the overload.


================
Comment at: clang-tools-extra/clangd/index/dex/DexIndex.cpp:129
+    const auto ProximityURIs =
+        generateQueryProximityURIs(ProximityPath, URISchemes);
+    for (size_t Distance = 0; Distance < ProximityURIs.size(); ++Distance) {
----------------
IIUC, the assumption here is that `generateQueryProximityURIs` returns 
proximity tokens of parent directories sorted by distance. This seems to be 
making assumption about the implementation. `generateQueryProximityURIs` should 
return the distance explicitly along with the tokens.


================
Comment at: clang-tools-extra/clangd/index/dex/DexIndex.cpp:176
+  std::sort(begin(IDAndScores), end(IDAndScores),
+            std::greater<DocIDAndScore>());
 
----------------
Just inline the comparator for clarity.


================
Comment at: clang-tools-extra/clangd/index/dex/DexIndex.cpp:179
   // Retrieve top Req.MaxCandidateCount items.
-  std::priority_queue<std::pair<float, const Symbol *>> Top;
-  for (const auto &P : SymbolDocIDs) {
-    const DocID SymbolDocID = P.first;
+  const size_t ItemsToScore = 10 * Req.MaxCandidateCount;
+  TopN<DocIDAndScore> Top(Req.MaxCandidateCount);
----------------
This is adding another staging of filtering. It seems to be an optimization 
that's not specific to the proximity path change and would probably require 
more careful review. Can we split this out of this patch for now?


================
Comment at: clang-tools-extra/clangd/index/dex/DexIndex.h:80
 
+  /// Stores symbols in the descending order using symbol quality as the key.
   std::vector<const Symbol *> Symbols;
----------------
nit: `Sorted in the descending order of symbol quality.`


================
Comment at: clang-tools-extra/clangd/index/dex/DexIndex.h:83
+  /// Maps DocID to the quality of underlying symbol.
+  std::vector<float> SymbolQuality;
   llvm::DenseMap<SymbolID, const Symbol *> LookupTable;
----------------
nit: `SymbolQuality[I] is the quality of Symbols[I]. `


================
Comment at: clang-tools-extra/clangd/index/dex/Iterator.h:139
 /// to acquire preliminary scores of requested items.
-std::vector<std::pair<DocID, float>> consume(Iterator &It);
+std::vector<DocIDAndScore> consume(Iterator &It);
 
----------------
I'm not sure about this refactoring. It seems to be out of the scope. Can we 
split it out and still return pair for now?


================
Comment at: clang-tools-extra/clangd/index/dex/Token.cpp:36
+  size_t Limit = 5;
+  while (!Path.empty() && Limit--) {
+    // FIXME(kbobyrev): Parsing and encoding path to URIs is not necessary and
----------------
ioeric wrote:
> For the original URI, we could simply add it to the result outside of the 
> loop (and `--Limit`) to save one iteration?
nit: `(--Limit > 0)` for clarity.


================
Comment at: clang-tools-extra/clangd/index/dex/Token.h:100
+std::vector<Token>
+generateQueryProximityURIs(llvm::StringRef AbsolutePath,
+                           llvm::ArrayRef<std::string> URISchemes);
----------------
As mentioned in another comment, this should return distance along with tokens 
(instead of assuming consecutive parents and distances).


https://reviews.llvm.org/D51481



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to