Re: [PR] Respect minCompetitiveScore in BlockMaxConjunctionBulkScorer [lucene]

via GitHub Tue, 03 Jun 2025 11:59:33 -0700


jpountz commented on code in PR #14751:
URL: https://github.com/apache/lucene/pull/14751#discussion_r2124651785



##########
lucene/core/src/java/org/apache/lucene/search/BlockMaxConjunctionBulkScorer.java:
##########
@@ -81,18 +83,20 @@ private float computeMaxScore(int windowMin, int windowMax) 
throws IOException {
   public int score(LeafCollector collector, Bits acceptDocs, int min, int max) 
throws IOException {
     collector.setScorer(scorable);
 
-    int windowMin = Math.max(lead.docID(), min);
+    int windowMin = scoreDocFirstUntilDynamicPruning(collector, acceptDocs, 
min, max);
+
     while (windowMin < max) {
-      // Use impacts of the least costly scorer to compute windows
       // NOTE: windowMax is inclusive
-      int windowMax = Math.min(scorers[0].advanceShallow(windowMin), max - 1);
-
-      if (0 < scorable.minCompetitiveScore) {
-        float maxWindowScore = computeMaxScore(windowMin, windowMax);
-        scoreWindowScoreFirst(collector, acceptDocs, windowMin, windowMax + 1, 
maxWindowScore);
+      int leadBlockEnd = scorers[0].advanceShallow(windowMin);
+      int windowMax = max - 1;
+      if (leadBlockEnd == DocIdSetIterator.NO_MORE_DOCS) {
+        windowMax = (int) Math.min(windowMin + DEFAULT_WINDOW_SIZE, windowMax);

Review Comment:
   I wonder if we should do it all the time, and not only when `leadBlockEnd == 
DocIdSetIterator.NO_MORE_DOCS`. Essentially, we'd be forcing the max block 
score to be re-evaluated at least every 65k docs, which feels like it shouldn't 
have too much overhead, and should help in more situations than your current 
approach, what do you think?



##########
lucene/core/src/java/org/apache/lucene/search/BlockMaxConjunctionBulkScorer.java:
##########
@@ -166,11 +174,12 @@ private void scoreWindowScoreFirst(
       docAndScoreAccBuffer.copyFrom(docAndScoreBuffer);
 
       for (int i = 1; i < scorers.length; ++i) {
-        ScorerUtil.filterCompetitiveHits(
-            docAndScoreAccBuffer,
-            sumOfOtherClauses[i],
-            scorable.minCompetitiveScore,
-            scorers.length);
+        double sumOfOtherClause = sumOfOtherClauses[i];
+        if (sumOfOtherClause != Double.POSITIVE_INFINITY

Review Comment:
   I'm tempted to make this optimization more general by checking if `(float) 
MathUtil.sumUpperBound(maxRemainingScore, numScorers) > minCompetitiveScore` in 
`ScorerUtil#filterCompetitiveHits` (before looping)?



##########
lucene/core/src/java/org/apache/lucene/search/BlockMaxConjunctionBulkScorer.java:
##########
@@ -166,11 +174,12 @@ private void scoreWindowScoreFirst(
       docAndScoreAccBuffer.copyFrom(docAndScoreBuffer);
 
       for (int i = 1; i < scorers.length; ++i) {
-        ScorerUtil.filterCompetitiveHits(
-            docAndScoreAccBuffer,
-            sumOfOtherClauses[i],
-            scorable.minCompetitiveScore,
-            scorers.length);
+        double sumOfOtherClause = sumOfOtherClauses[i];
+        if (sumOfOtherClause != Double.POSITIVE_INFINITY
+            && sumOfOtherClause != sumOfOtherClauses[i - 1]) {

Review Comment:
   It took me some time to understand why this works, can you add a code 
comment, e.g. "two equal consecutive values mean that the first clause always 
returns a score of zero, so we don't need to filter hits by score again" 
(assuming I understand correctly)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Re: [PR] Respect minCompetitiveScore in BlockMaxConjunctionBulkScorer [lucene]

Reply via email to