mayya-sharipova commented on a change in pull request #267:
URL: https://github.com/apache/lucene/pull/267#discussion_r700580884



##########
File path: lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraph.java
##########
@@ -107,32 +113,82 @@ public static NeighborQueue search(
       Random random)
       throws IOException {
     int size = graphValues.size();
+    int boundedNumSeed = Math.min(numSeed, 2 * size);
+    NeighborQueue results;
+
+    if (graphValues.maxLevel() == 0) {
+      // search in SNW; generate a number of entry points randomly
+      final int[] eps = new int[boundedNumSeed];
+      for (int i = 0; i < boundedNumSeed; i++) {
+        eps[i] = random.nextInt(size);
+      }
+      return searchLevel(query, topK, 0, eps, vectors, similarityFunction, 
graphValues, acceptOrds);
+    } else {
+      // search in hierarchical SNW
+      int[] eps = new int[] {graphValues.entryNode()};
+      for (int level = graphValues.maxLevel(); level >= 1; level--) {
+        results =
+            HnswGraph.searchLevel(
+                query, 1, level, eps, vectors, similarityFunction, 
graphValues, acceptOrds);
+        eps = new int[] {results.pop()};
+      }
+      results =
+          HnswGraph.searchLevel(
+              query, boundedNumSeed, 0, eps, vectors, similarityFunction, 
graphValues, acceptOrds);
+      while (results.size() > topK) {
+        results.pop();
+      }
+      return results;
+    }
+  }
 
+  /**
+   * Searches for the nearest neighbors of a query vector in a given level
+   *
+   * @param query search query vector
+   * @param topK the number of nearest to query results to return

Review comment:
       Indeed, it is usually the case except these conditions:
   
   1. When doing a hierarchical search.  As we reach level 0, we use a single 
`ep` found from the level 1 (`eps.length=1`), while topK is > 1.
   2. When building a hierarchical graph.   When adding a node, as we reach 
`nodeLevel`   -  a starting level for this node,  we use a single `ep` found 
from the `level = nodeLevel + 1`, while using topk = efconstruction.
   
   I am happy to discuss alternative API where we may not need topK even for 
these cases, for exampling expanding `eps` array to have at least `topK` 
elements.
   
    




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Reply via email to