gsmiller commented on PR #11741:
URL: https://github.com/apache/lucene/pull/11741#issuecomment-1241017654

   @jpountz thanks for the feedback! If we assume a scenario where we have a 
`TermInSetQuery` over very selective terms (low docFreqs for each), we'd want 
to use the index query unless another, significantly more restrictive clause 
can lead the query. So there's no benefit to using `IndexOrDocValuesQuery` in 
that scenario, but moving the term lookup to the `scorerSupplier` also 
shouldn't hurt this case, since we have to do the lookup anyway.
   
   On the other hand, with today's implementation, we may not know that the 
`TermInSetQuery` is very selective (e.g., maybe there are terms in the field, 
but not in the query, that match a large number of documents). In this case, it 
may be beneficial to use the index-based query, but—because of our naive cost 
heuristic—we could end up using a doc-values query because we're significantly 
over-estimating the cost of the index query.
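
To make the over-estimate concrete, here is a rough sketch in plain Java with no Lucene dependency. Everything in it is an assumption for illustration: the class and method names are hypothetical, the "naive" estimate is modeled as the sum of docFreqs over every term in the field, and the decision rule (an 8x penalty applied to doc values before comparing against the lead cost) is my reading of how `IndexOrDocValuesQuery` chooses, not a copy of its code.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical model of the cost trade-off; not Lucene code.
final class TermInSetCostSketch {

  // Naive upper bound (assumed): every posting in the field might match,
  // so we sum docFreq over all terms in the field without any term lookup.
  static long naiveCost(Map<String, Integer> fieldDocFreqs) {
    return fieldDocFreqs.values().stream().mapToLong(Integer::longValue).sum();
  }

  // Exact cost: requires looking up each query term to read its docFreq.
  static long exactCost(Map<String, Integer> fieldDocFreqs, Set<String> queryTerms) {
    long cost = 0;
    for (String term : queryTerms) {
      cost += fieldDocFreqs.getOrDefault(term, 0);
    }
    return cost;
  }

  // Assumed decision rule: prefer the index query unless its cost is
  // more than roughly 8x the cost of the clause leading the iteration.
  static boolean useIndexQuery(long indexCost, long leadCost) {
    return (indexCost >>> 3) <= leadCost;
  }

  public static void main(String[] args) {
    // One very common term exists in the field but is not part of the query.
    Map<String, Integer> fieldDocFreqs =
        Map.of("rare1", 2, "rare2", 3, "common", 1_000_000);
    Set<String> queryTerms = Set.of("rare1", "rare2");
    long leadCost = 100;

    long naive = naiveCost(fieldDocFreqs);
    long exact = exactCost(fieldDocFreqs, queryTerms);
    System.out.println("naive cost picks index query: " + useIndexQuery(naive, leadCost));
    System.out.println("exact cost picks index query: " + useIndexQuery(exact, leadCost));
  }
}
```

With these made-up numbers, the naive estimate steers the query to doc values even though the two query terms match only five documents in total, while the exact cost (which needs the term lookup) picks the index query.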
   
   I think the case where this approach would really hurt us is when the 
`TermInSetQuery` is not particularly restrictive, to the point that we end up 
using the doc-values query, but we still have to pay the up-front cost of 
looking up terms just to decide we don't want to use the index-based query 
after all.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

