benwtrent commented on code in PR #14978:
URL: https://github.com/apache/lucene/pull/14978#discussion_r2246229973


##########
lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99HnswVectorsReader.java:
##########
@@ -344,15 +346,33 @@ private void search(
       HnswGraphSearcher.search(
           scorer, collector, getGraph(fieldEntry), acceptedOrds, filteredDocCount);
     } else {
-      // if k is larger than the number of vectors, we can just iterate over all vectors
-      // and collect them
+      // if k is larger than the number of vectors we expect to visit in an HNSW search,
+      // we can just iterate over all vectors and collect them.
+      int[] ords = new int[EXHAUSTIVE_BULK_SCORE_ORDS];
+      float[] scores = new float[EXHAUSTIVE_BULK_SCORE_ORDS];
+      int numOrds = 0;
       for (int i = 0; i < scorer.maxOrd(); i++) {
         if (acceptedOrds == null || acceptedOrds.get(i)) {
           if (knnCollector.earlyTerminated()) {
             break;
           }
+          ords[numOrds++] = i;
+          if (numOrds == ords.length) {
+            scorer.bulkScore(ords, scores, numOrds);
+            for (int j = 0; j < numOrds; j++) {
+              knnCollector.incVisitedCount(1);
+              knnCollector.collect(scorer.ordToDoc(ords[j]), scores[j]);
+            }
+            numOrds = 0;
+          }
+        }
+      }
+
+      if (numOrds > 0) {
+        scorer.bulkScore(ords, scores, numOrds);

Review Comment:
   For `bulkScore`, I think it would be beneficial for the API to return the 
batch's `maxScore`. That way, the per-ordinal iteration and collection can be 
skipped entirely when even the best score in the batch isn't competitive.
   
   I realize this "complicates" the `incVisitedCount` bookkeeping, but I think 
that can be fixed by pulling the call up out of the loop as a single 
`knnCollector.incVisitedCount(numOrds)`.
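
To make the suggestion concrete, here is a minimal, self-contained sketch of how the flush step might look if `bulkScore` returned the best score of the batch. Everything in it is an assumption for illustration: `Scorer` and `Collector` are simplified stand-ins rather than the real Lucene types, the `float` return value of `bulkScore` is the proposed change and not an existing API, and the collector uses a fixed threshold where a real one would raise it as better hits arrive.

```java
import java.util.ArrayList;
import java.util.List;

public class BulkScoreSketch {

  /** Hypothetical scorer: bulkScore fills scores[0..numOrds) and returns their maximum. */
  interface Scorer {
    float bulkScore(int[] ords, float[] scores, int numOrds);

    int ordToDoc(int ord);
  }

  /** Simplified stand-in for a KNN collector with a fixed competitive threshold. */
  static final class Collector {
    final float minCompetitive;
    final List<Integer> docs = new ArrayList<>();
    int visited = 0;

    Collector(float minCompetitive) {
      this.minCompetitive = minCompetitive;
    }

    void incVisitedCount(int count) {
      visited += count;
    }

    float minCompetitiveSimilarity() {
      return minCompetitive;
    }

    void collect(int doc, float score) {
      // Keep only hits that meet the (fixed, illustrative) threshold.
      if (score >= minCompetitive) {
        docs.add(doc);
      }
    }
  }

  /** Scores one batch and collects it only if the batch can contain a competitive hit. */
  static void scoreAndCollect(
      Scorer scorer, Collector collector, int[] ords, float[] scores, int numOrds) {
    float maxScore = scorer.bulkScore(ords, scores, numOrds);
    // Pulled up: every ordinal in the batch was scored, so count them all at once.
    collector.incVisitedCount(numOrds);
    if (maxScore < collector.minCompetitiveSimilarity()) {
      return; // nothing in this batch can be collected; skip the loop
    }
    for (int j = 0; j < numOrds; j++) {
      collector.collect(scorer.ordToDoc(ords[j]), scores[j]);
    }
  }
}
```

With this shape, the visited count stays accurate whether or not the batch is collected, and the same helper could serve both the full-batch flush inside the loop and the final partial flush after it.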



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org