gsmiller commented on code in PR #11738: URL: https://github.com/apache/lucene/pull/11738#discussion_r966070400
########## lucene/core/src/java/org/apache/lucene/search/MultiTermQueryConstantScoreWrapper.java: ########## @@ -165,9 +143,46 @@ private WeightOrDocIdSet rewrite(LeafReaderContext context) throws IOException { PostingsEnum docs = null; - final List<TermAndState> collectedTerms = new ArrayList<>(); - if (collectTerms(context, termsEnum, collectedTerms)) { - // build a boolean query + // We will first try to collect up to 'threshold' terms into 'matchingTerms' + // if there are too many terms, we will fall back to building the 'builder' Review Comment: Thanks for the suggestion @rmuir. Let me see if I can use the existing code structure a bit more in this change. The reason I didn't want to just call `collectTerms` as-is is that we could unnecessarily seek and load term states when we've already found a term covering all docs. For example, if the first term we visit covers all terms, we can just stop there. I'm also not sure I'm following your point about reallocating `collectedTerms` as part of this change? That's certainly not my intention with this code, but maybe I'm staring at a bug and not realizing it? As soon as we hit the size threshold, we should be nulling out `collectTerms`, initializing a building and just using that for the remaining term iteration. Apologies if I'm overlooking something though. Entirely possible. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org