gsmiller commented on PR #11741: URL: https://github.com/apache/lucene/pull/11741#issuecomment-1241017654
@jpountz thanks for the feedback!

If we assume a scenario where we have a `TermInSetQuery` over very selective terms (low docFreqs for each), we'd want to use the index query unless there's another clause that can lead the query and is significantly more restrictive. So there's no benefit to using `IndexOrDocValuesQuery` in that scenario, but moving the term lookup to the `scorerSupplier` also shouldn't hurt this case, since we have to do the lookup anyway.

On the other hand, with today's implementation, we may not know that the `TermInSetQuery` is very selective (e.g., maybe there are terms in the field, but not in the query, that match a large number of documents). In this case it may be beneficial to use the index-based query, but because of our naive cost heuristic we could end up using a doc-values query, since we're significantly over-estimating the cost of the index query.

I think the case where this approach would really hurt us is when the `TermInSetQuery` is not particularly restrictive, to the point that we end up using the doc-values query, but we still have to pay the up-front cost of looking up terms just to decide we don't want to use the index-based query after all.
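To make the over-estimation case concrete, here is a minimal, self-contained sketch. It does not use Lucene's API; `CostHeuristicSketch`, `naiveCost`, `lookedUpCost`, `choose`, and the `factor` of 8 are all hypothetical names and values chosen for illustration, and the decision rule is a simplified stand-in for whatever `IndexOrDocValuesQuery` actually does.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical model of the trade-off discussed above; not Lucene code.
public class CostHeuristicSketch {
    enum Strategy { INDEX, DOC_VALUES }

    // Naive estimate (assumption): without looking up the query terms,
    // assume the query could match every posting in the field.
    static long naiveCost(Map<String, Long> fieldDocFreqs) {
        long sum = 0;
        for (long docFreq : fieldDocFreqs.values()) {
            sum += docFreq;
        }
        return sum;
    }

    // Looked-up estimate: pay the term lookup up front and sum the
    // docFreq of each query term (terms missing from the field cost 0).
    static long lookedUpCost(Map<String, Long> fieldDocFreqs, Set<String> queryTerms) {
        long sum = 0;
        for (String term : queryTerms) {
            sum += fieldDocFreqs.getOrDefault(term, 0L);
        }
        return sum;
    }

    // Simplified rule (assumption): use the index query unless it is more
    // than `factor` times as expensive as the clause leading the iteration.
    static Strategy choose(long indexCost, long leadCost, long factor) {
        return indexCost <= leadCost * factor ? Strategy.INDEX : Strategy.DOC_VALUES;
    }

    public static void main(String[] args) {
        // Two rare terms in the query, one very common term only in the field.
        Map<String, Long> field = Map.of("rare1", 3L, "rare2", 5L, "common", 1_000_000L);
        Set<String> query = Set.of("rare1", "rare2");
        long leadCost = 100;
        long factor = 8;

        System.out.println(choose(naiveCost(field), leadCost, factor));           // DOC_VALUES
        System.out.println(choose(lookedUpCost(field, query), leadCost, factor)); // INDEX
    }
}
```

In this toy setup the naive estimate (1,000,008) pushes the choice to doc values even though the query terms only cover 8 documents. The looked-up estimate picks the index query, at the price of doing the term lookup before the decision is made.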