[GitHub] [lucene] jtibshirani commented on a diff in pull request #11756: LUCENE-10577: Remove LeafReader#searchNearestVectorsExhaustively

GitBox Wed, 07 Sep 2022 11:29:27 -0700


jtibshirani commented on code in PR #11756:
URL: https://github.com/apache/lucene/pull/11756#discussion_r965154356



##########
lucene/core/src/java/org/apache/lucene/search/KnnVectorQuery.java:
##########
@@ -175,9 +176,42 @@ private TopDocs approximateSearch(LeafReaderContext context, Bits acceptDocs, in
   }
 
   // We allow this to be overridden so that tests can check what search strategy is used
-  protected TopDocs exactSearch(LeafReaderContext context, DocIdSetIterator acceptDocs)
+  protected TopDocs exactSearch(LeafReaderContext context, DocIdSetIterator acceptIterator)
       throws IOException {
-    return context.reader().searchNearestVectorsExhaustively(field, target, k, acceptDocs);
+    FieldInfo fi = context.reader().getFieldInfos().fieldInfo(field);
+    if (fi == null || fi.getVectorDimension() == 0) {
+      // The field does not exist or does not index vectors
+      return NO_RESULTS;
+    }
+
+    VectorScorer vectorScorer = VectorScorer.create(context, fi, target);
+    HitQueue queue = new HitQueue(k, true);
+    ScoreDoc topDoc = queue.top();
+    int doc;
+    while ((doc = acceptIterator.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
+      boolean advanced = vectorScorer.advanceExact(doc);
+      assert advanced;
+
+      float score = vectorScorer.score();
+      if (score >= topDoc.score) {

Review Comment:
   That seems right. I updated this and pushed a test covering the tie-breaking 
case. As a note, we don't guarantee that we'll always return the lowest matching 
doc IDs, since approximate HNSW search can't do this efficiently.
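
   To illustrate the tie-breaking case, here is a minimal standalone sketch. 
It is not the code in this PR: it uses `java.util.PriorityQueue` instead of 
Lucene's `HitQueue`, and the class `TieBreakSketch` and method `topK` are 
hypothetical names. It assumes docs are visited in increasing doc ID order and 
that a hit replaces the weakest queue entry only when its score is strictly 
greater, so that equal scores keep the lower doc ID.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.PriorityQueue;

public class TieBreakSketch {

  // Hypothetical helper, not a Lucene API: scores[doc] is the score of doc ID `doc`.
  // Returns up to k doc IDs, best first; equal scores are ordered by lower doc ID.
  static int[] topK(float[] scores, int k) {
    if (k <= 0) {
      return new int[0];
    }
    // Min-heap whose head is the weakest hit kept so far. Among equal scores,
    // the higher doc ID counts as weaker, so it is the first to be evicted.
    Comparator<Integer> weakestFirst = (a, b) -> {
      int cmp = Float.compare(scores[a], scores[b]);
      return cmp != 0 ? cmp : Integer.compare(b, a);
    };
    PriorityQueue<Integer> queue = new PriorityQueue<>(weakestFirst);

    // Docs are visited in increasing doc ID order, like a DocIdSetIterator.
    for (int doc = 0; doc < scores.length; doc++) {
      if (queue.size() < k) {
        queue.add(doc);
      } else if (scores[doc] > scores[queue.peek()]) {
        // Strict comparison: a later doc with an equal score does not
        // displace an earlier one, so ties favor the lower doc ID.
        queue.poll();
        queue.add(doc);
      }
    }

    // Polling yields the weakest hit first, so fill the result from the end.
    int[] result = new int[queue.size()];
    for (int i = result.length - 1; i >= 0; i--) {
      result[i] = queue.poll();
    }
    return result;
  }

  public static void main(String[] args) {
    float[] scores = {0.5f, 0.9f, 0.5f, 0.9f, 0.5f};
    System.out.println(Arrays.toString(topK(scores, 3))); // prints [1, 3, 0]
  }
}
```

   In this sketch, docs 0, 2 and 4 tie at 0.5 and only one of them fits in 
the top 3. With the strict comparison, doc 0 is kept. Changing `>` to `>=` would 
let doc 4 evict doc 0 and return [1, 3, 4] instead.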



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

