[ 
https://issues.apache.org/jira/browse/LUCENE-10544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17530000#comment-17530000
 ] 

Greg Miller commented on LUCENE-10544:
--------------------------------------

{quote}In my opinion, a better solution that has less overhead and would still 
support cancelling such slow queries consists of leveraging 
{{BulkScorer#score}} to score small-ish ranges of doc IDs at a time.
{quote}
+1. As a short-term solution, we've had success with a "timeout enforcing" 
Query that checks the timeout within the Scorer it provides, but this approach 
has a number of flaws. Hooking into the BulkScorer makes sense, but it needs 
some thought, as [~dpsharma] mentions, since Queries may (and do!) provide 
their own BulkScorers in some cases (e.g., {{BooleanScorer}}).
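To make the windowed-scoring idea concrete, here is a minimal sketch, not Lucene code. {{RangeScorer}} is a hypothetical stand-in for the range form of {{BulkScorer#score}}, and {{TimeLimitedRangeScorer}}, {{TimeExceededException}}, the {{timedOut}} supplier and {{windowSize}} are all illustrative names. It assumes the delegate returns a doc ID at or after the requested upper bound, and it does not address how such a wrapper would be installed around BulkScorers that Queries provide themselves.

```java
import java.util.function.BooleanSupplier;
import java.util.function.IntConsumer;

/** Hypothetical stand-in for the range-scoring part of Lucene's BulkScorer. */
interface RangeScorer {
  int NO_MORE_DOCS = Integer.MAX_VALUE;

  /**
   * Passes matching docs in [min, max) to the collector and returns the next
   * matching doc ID at or after max, or NO_MORE_DOCS if there is none.
   */
  int score(IntConsumer collector, int min, int max);
}

/** Hypothetical wrapper: scores in small windows and checks the timeout between them. */
final class TimeLimitedRangeScorer implements RangeScorer {
  /** Thrown when the timeout is detected between two windows. */
  static final class TimeExceededException extends RuntimeException {}

  private final RangeScorer in;
  private final BooleanSupplier timedOut; // assumed timeout signal
  private final int windowSize;           // assumed tuning knob

  TimeLimitedRangeScorer(RangeScorer in, BooleanSupplier timedOut, int windowSize) {
    if (windowSize <= 0) {
      throw new IllegalArgumentException("windowSize must be positive");
    }
    this.in = in;
    this.timedOut = timedOut;
    this.windowSize = windowSize;
  }

  @Override
  public int score(IntConsumer collector, int min, int max) {
    int next = min;
    while (next < max) {
      // One timeout check per window, rather than one per document.
      if (timedOut.getAsBoolean()) {
        throw new TimeExceededException();
      }
      // Long arithmetic so that next + windowSize cannot overflow.
      int windowEnd = (int) Math.min((long) next + windowSize, max);
      next = in.score(collector, next, windowEnd);
    }
    return next;
  }
}
```

Because the delegate reports the next matching doc ID, sparse regions of the doc ID space are skipped in a single step, so the number of timeout checks is bounded by the number of windows that actually contain work.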
{quote}Long-term I'd like ExitableDirectoryReader and other tooling to handle 
cancellation/timeout to become mostly implementation details, and have proper 
support directly on IndexSearcher (LUCENE-10151).
{quote}
+1. For full disclosure, [~dpsharma] and I work together at Amazon and she is 
working on LUCENE-10151. One idea is to use {{ExitableDirectoryReader}} as an 
internal implementation detail of {{IndexSearcher}} to add first-class timeout 
support. While we were debugging some prototype code, we ran into this issue 
with {{ExitableDirectoryReader}} and I thought it warranted a spin-off issue 
since it seems like something we might want to generally fix.

> Should ExitableTermsEnum wrap postings and impacts?
> ---------------------------------------------------
>
>                 Key: LUCENE-10544
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10544
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/index
>            Reporter: Greg Miller
>            Priority: Major
>
> While looking into options for LUCENE-10151, I noticed that 
> {{ExitableDirectoryReader}} doesn't actually do any timeout checking once you 
> start iterating postings/impacts. It *does* create an {{ExitableTermsEnum}} 
> wrapper when loading a {{TermsEnum}}, but that wrapper doesn't do 
> anything to wrap postings or impacts. So timeouts will be enforced when 
> moving to the "next" term, but not when iterating the postings/impacts 
> associated with a term.
> I think we ought to wrap the postings/impacts as well with some form of 
> timeout checking so timeouts can be enforced on long-running queries. I'm not 
> sure why this wasn't done originally (back in 2014), but it was questioned 
> back in 2020 on the original Jira SOLR-5986. Does anyone know of a good 
> reason why we shouldn't enforce timeouts in this way?
> Relatedly, we may also want to wrap methods like {{seekExact}} and 
> {{seekCeil}}, given that only {{next}} is wrapped currently.
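
As an illustration of the wrapping proposed in the description above, the following is a minimal sketch under stated assumptions, not the actual {{ExitableDirectoryReader}} code. {{DocIterator}} is a hypothetical stand-in for the doc ID iteration part of a postings enum, and {{ExitableDocIterator}}, {{ExitingException}}, the {{shouldExit}} supplier and {{checkInterval}} are illustrative names. Sampling the timeout only every {{checkInterval}} calls is one assumed way to limit per-document overhead.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical stand-in for the doc ID iteration part of a postings enum. */
interface DocIterator {
  int NO_MORE_DOCS = Integer.MAX_VALUE;

  /** Moves to the next doc and returns its ID, or NO_MORE_DOCS when exhausted. */
  int nextDoc();

  /** Moves to the first doc at or after target and returns its ID, or NO_MORE_DOCS. */
  int advance(int target);
}

/** Hypothetical wrapper that enforces a timeout while iterating postings. */
final class ExitableDocIterator implements DocIterator {
  /** Thrown when the timeout is detected during iteration. */
  static final class ExitingException extends RuntimeException {}

  private final DocIterator in;
  private final BooleanSupplier shouldExit; // assumed timeout signal
  private final int checkInterval;          // assumed sampling rate
  private int calls;

  ExitableDocIterator(DocIterator in, BooleanSupplier shouldExit, int checkInterval) {
    if (checkInterval <= 0) {
      throw new IllegalArgumentException("checkInterval must be positive");
    }
    this.in = in;
    this.shouldExit = shouldExit;
    this.checkInterval = checkInterval;
  }

  /** Consults the timeout signal once every checkInterval iteration calls. */
  private void checkTimeout() {
    if (++calls >= checkInterval) {
      calls = 0;
      if (shouldExit.getAsBoolean()) {
        throw new ExitingException();
      }
    }
  }

  @Override
  public int nextDoc() {
    checkTimeout();
    return in.nextDoc();
  }

  @Override
  public int advance(int target) {
    checkTimeout();
    return in.advance(target);
  }
}
```

With {{checkInterval}} set to 1 the timeout is checked on every call; larger values check less often, which lowers overhead but lets iteration run up to {{checkInterval}} calls past the deadline.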


