[GitHub] [lucene-solr] jpountz commented on a change in pull request #2205: LUCENE-9668: Deprecate MinShouldMatchSumScorer with WANDScorer

GitBox Sat, 16 Jan 2021 06:27:24 -0800


jpountz commented on a change in pull request #2205:
URL: https://github.com/apache/lucene-solr/pull/2205#discussion_r558902897




##########
File path: 
lucene/core/src/java/org/apache/lucene/search/Boolean2ScorerSupplier.java
##########
@@ -231,10 +231,27 @@ private Scorer opt(
         optionalScorers.add(scorer.get(leadCost));
       }
 
-      if (scoreMode == ScoreMode.TOP_SCORES) {
-        return new WANDScorer(weight, optionalScorers, minShouldMatch);
-      } else if (minShouldMatch > 1) {
-        return new MinShouldMatchSumScorer(weight, optionalScorers, 
minShouldMatch);
+      // nocommit
+      //
+      // The following updated condition follows the previous one that also 
utilized
+      // MinShouldMatchSumScorer However, technically speaking, WANDScorer is 
also able to
+      // handle the case where minShouldMatch == 0 (proven with existing 
possible condition of
+      // scoreMode == SoreMode.TOP_SCORES && minShouldMatch == 0), but 
removing the
+      // minShouldMatch > 1 condition would also stop WANDScorer to handle the 
case
+      // where scoreMode.needsScore() == false, which WANDScorer is also 
capable of
+      // (proven with existing possible condition of scoreMode != 
ScoreMode.TOP_SCORES &&
+      // minShouldMatch > 1).
+      //
+      // Ultimately, WANDScorer should be able to handle the following 
conditions now:
+      // 1. Any ScoreMode (with scoring or not), although it might be a bit 
slower compared to
+      //   DisjunctionSumScorer due to data structures and algorithms usd
+      // 2. Any minCompetitiveScore ( >= 0 )
+      // 3. Any minShouldMatch ( >= 0 )
+      //
+      // So it seems we might need a different condition check to 
differentiate usage between
+      // WANDScorer and DisjunctionSumScorer ?

Review comment:
       The current condition looks right to me. Maybe the comment could say 
something like `WANDScorer can handle all cases but we specialize exhaustive 
pure disjunctions with DisjunctionSumScorer which is a bit faster`?

##########
File path: 
lucene/core/src/java/org/apache/lucene/search/DisjunctionMaxScorer.java
##########
@@ -50,7 +50,7 @@
     if (tieBreakerMultiplier < 0 || tieBreakerMultiplier > 1) {
       throw new IllegalArgumentException("tieBreakerMultiplier must be in [0, 
1]");
     }
-    if (scoreMode == ScoreMode.TOP_SCORES) {
+    if (scoreMode.needsScores()) {

Review comment:
       why do we need to change this?

##########
File path: lucene/core/src/java/org/apache/lucene/search/WANDScorer.java
##########
@@ -319,7 +356,10 @@ private void pushBackLeads(int target) throws IOException {
       }
     }
     lead = null;
-    leadMaxScore = 0;
+
+    if (needsScore) {
+      leadMaxScore = 0;
+    }

Review comment:
       I wonder if this condition could be removed in order to make this class 
easier to understand by minimizing differences in code paths between the 
scoring and the non-scoring cases? Are there other branches we could remove?

##########
File path: lucene/core/src/java/org/apache/lucene/search/WANDScorer.java
##########
@@ -149,23 +167,33 @@ private static long scaleMinScore(float minScore, int 
scalingFactor) {
     this.doc = -1;
     this.upTo = -1; // will be computed on the first call to nextDoc/advance
 
+    this.needsScore = needsScore;
+
     head = new DisiPriorityQueue(scorers.size());
     // there can be at most num_scorers - 1 scorers beyond the current position
     tail = new DisiWrapper[scorers.size()];
 
-    OptionalInt scalingFactor = OptionalInt.empty();
-    for (Scorer scorer : scorers) {
-      scorer.advanceShallow(0);
-      float maxScore = scorer.getMaxScore(DocIdSetIterator.NO_MORE_DOCS);
-      if (maxScore != 0 && Float.isFinite(maxScore)) {
-        // 0 and +Infty should not impact the scale
-        scalingFactor =
-            OptionalInt.of(
-                Math.min(scalingFactor.orElse(Integer.MAX_VALUE), 
scalingFactor(maxScore)));
+    if (needsScore) {

Review comment:
       likewise here I would have expected to check `scoreMode == TOP_SCORES` 
rather than `scoreMode.needsScores()` since calling `advanceShallow` 
`getMaxScore` only really makes sense if you plan to skip documents based on 
scores?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[GitHub] [lucene-solr] jpountz commented on a change in pull request #2205: LUCENE-9668: Deprecate MinShouldMatchSumScorer with WANDScorer

Reply via email to