gsmiller commented on code in PR #13201: URL: https://github.com/apache/lucene/pull/13201#discussion_r1704183629
########## lucene/core/src/java/org/apache/lucene/search/AbstractMultiTermQueryConstantScoreWrapper.java: ########## @@ -240,9 +219,38 @@ public ScorerSupplier scorerSupplier(LeafReaderContext context) throws IOExcepti return null; } - final long cost = estimateCost(terms, q.getTermsCount()); - IOSupplier<WeightOrDocIdSetIterator> weightOrIteratorSupplier = rewrite(context, terms); - if (weightOrIteratorSupplier == null) return null; + assert terms != null; + + final int fieldDocCount = terms.getDocCount(); + final TermsEnum termsEnum = q.getTermsEnum(terms); + assert termsEnum != null; + + List<TermAndState> collectedTerms = new ArrayList<>(); + boolean collectResult = collectTerms(fieldDocCount, termsEnum, collectedTerms); + + if (collectResult && collectedTerms.isEmpty()) return null; Review Comment: nit: stylistically at least, I think we generally prefer always using braces even for one-liners. Mind including them here? ########## lucene/core/src/java/org/apache/lucene/search/AbstractMultiTermQueryConstantScoreWrapper.java: ########## @@ -240,9 +219,38 @@ public ScorerSupplier scorerSupplier(LeafReaderContext context) throws IOExcepti return null; } - final long cost = estimateCost(terms, q.getTermsCount()); - IOSupplier<WeightOrDocIdSetIterator> weightOrIteratorSupplier = rewrite(context, terms); - if (weightOrIteratorSupplier == null) return null; + assert terms != null; + + final int fieldDocCount = terms.getDocCount(); + final TermsEnum termsEnum = q.getTermsEnum(terms); + assert termsEnum != null; + + List<TermAndState> collectedTerms = new ArrayList<>(); + boolean collectResult = collectTerms(fieldDocCount, termsEnum, collectedTerms); + + if (collectResult && collectedTerms.isEmpty()) return null; + + final long cost; + if (collectResult) { + long sumTermCost = 0; + for (TermAndState collectedTerm : collectedTerms) { + sumTermCost += collectedTerm.docFreq; Review Comment: Just sort of thinking out loud here, but I wonder how expensive it is to build the boolean query and grab its weight / scoreSupplier at this point (instead of lazily doing it in `scoreSupplier#get`)? If it's cheap, another option would be to invoke `rewriteAsBooleanQuery`, pull its score supplier and then delegate `#cost` to it. That way if the cost logic in `BooleanQuery` evolves in the future, this benefits as well. Probably not worth messing with this now, but maybe worth a TODO comment in here? That is, unless we know it is prohibitively expensive. ########## lucene/core/src/java/org/apache/lucene/search/AbstractMultiTermQueryConstantScoreWrapper.java: ########## @@ -240,9 +219,38 @@ public ScorerSupplier scorerSupplier(LeafReaderContext context) throws IOExcepti return null; } - final long cost = estimateCost(terms, q.getTermsCount()); - IOSupplier<WeightOrDocIdSetIterator> weightOrIteratorSupplier = rewrite(context, terms); - if (weightOrIteratorSupplier == null) return null; + assert terms != null; + + final int fieldDocCount = terms.getDocCount(); + final TermsEnum termsEnum = q.getTermsEnum(terms); + assert termsEnum != null; + + List<TermAndState> collectedTerms = new ArrayList<>(); + boolean collectResult = collectTerms(fieldDocCount, termsEnum, collectedTerms); + + if (collectResult && collectedTerms.isEmpty()) return null; + + final long cost; + if (collectResult) { Review Comment: We end up checking `collectResult` twice in the common case here. We could refactor this slightly to have one conditional block for when `collectResult == true` instead (and pull the `isEmpty` short-circuit check into it). WDYT? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org