Re: [PR] Use vector bulk scoring in entry-point and filter hnsw search [lucene]

via GitHub Fri, 12 Dec 2025 19:22:42 -0800


john-wagster commented on code in PR #15500:
URL: https://github.com/apache/lucene/pull/15500#discussion_r2616025867



##########
lucene/core/src/java/org/apache/lucene/util/hnsw/AbstractHnswGraphSearcher.java:
##########
@@ -81,4 +82,38 @@ public void search(
     }
     searchLevel(results, scorer, 0, eps, graph, acceptOrds);
   }
+
+  protected static void scoreEntryPoints(
+      KnnCollector results,
+      RandomVectorScorer scorer,
+      BitSet visited,
+      int[] eps,
+      Bits acceptOrds,
+      NeighborQueue candidates,
+      float[] scores)
+      throws IOException {
+    assert eps != null && eps.length > 0;
+    assert scores != null && scores.length >= eps.length;
+    if (eps.length == 1) {
+      visited.set(eps[0]);
+      float score = scorer.score(eps[0]);
+      results.incVisitedCount(1);
+      candidates.add(eps[0], score);
+      if (acceptOrds == null || acceptOrds.get(eps[0])) {
+        results.collect(eps[0], score);
+      }
+    } else {
+      scorer.bulkScore(eps, scores, eps.length);

Review Comment:
   Saying this out loud in case my assumption is wrong. I assume the reason we
don't need benchmarking here is that we know from prior work at the leaf level
that there's definitely a benefit to bulk scoring `eps` here instead of doing an
early termination check for each entry point.
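
To make the trade-off concrete, here is a minimal, self-contained sketch of the two strategies. It is an illustration only, not the PR's code: `Scorer`, `scoreOneByOne`, `scoreInBulk`, and `visitLimit` are hypothetical names, and the visit budget is an assumed stand-in for whatever early-termination condition the collector applies.

```java
/** Illustrative sketch only; none of these types are Lucene classes. */
final class EntryPointScoringSketch {

  /** Hypothetical stand-in for a vector scorer. */
  public interface Scorer {
    float score(int node);

    // Default: one score() call per node. A real implementation could
    // override this with a vectorized or prefetching variant.
    default void bulkScore(int[] nodes, float[] scores, int numNodes) {
      for (int i = 0; i < numNodes; i++) {
        scores[i] = score(nodes[i]);
      }
    }
  }

  /**
   * Scores entry points one at a time and checks an assumed visit budget
   * before each call. Returns how many entry points were actually scored.
   */
  static int scoreOneByOne(Scorer scorer, int[] eps, float[] scores, int visitLimit) {
    int scored = 0;
    for (int i = 0; i < eps.length; i++) {
      if (scored >= visitLimit) {
        break; // early termination: remaining entry points are skipped
      }
      scores[i] = scorer.score(eps[i]);
      scored++;
    }
    return scored;
  }

  /**
   * Scores every entry point in a single call. Any termination check can
   * only happen after the whole batch has been scored.
   */
  static int scoreInBulk(Scorer scorer, int[] eps, float[] scores) {
    scorer.bulkScore(eps, scores, eps.length);
    return eps.length;
  }

  public static void main(String[] args) {
    Scorer scorer = node -> 1.0f / (1 + node); // toy similarity function
    int[] eps = {0, 1, 3, 7};

    float[] perNode = new float[eps.length];
    float[] bulk = new float[eps.length];

    int scoredPerNode = scoreOneByOne(scorer, eps, perNode, 2);
    int scoredBulk = scoreInBulk(scorer, eps, bulk);

    System.out.println("per-node scored: " + scoredPerNode);
    System.out.println("bulk scored: " + scoredBulk);
  }
}
```

With a tight budget the per-node loop can stop early, while the bulk call always scores the whole batch. The assumption in the comment above is that the number of entry points is small enough that the batch win outweighs the occasional wasted score.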



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


