[ https://issues.apache.org/jira/browse/LUCENE-8319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17187102#comment-17187102 ]

Adrien Grand commented on LUCENE-8319:
--------------------------------------

A problem with TimeLimitingCollector and ExitableDirectoryReader is that they add layers of abstraction to things that are called in very tight loops. One combination that we found to work well for Elasticsearch is to use ExitableDirectoryReader only for terms/points and make IndexSearcher wrap the top-level bulk scorer to split the doc ID space in exponentially growing windows of doc IDs and check the timeout between windows in order to keep the overhead to a minimum.

Timeout handling seems to be a frequent need so maybe we should add support for it directly on IndexSearcher where we could more easily do the right thing?

> A Time-limiting collector that works with CollectorManagers
> -----------------------------------------------------------
>
>                 Key: LUCENE-8319
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8319
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/search
>            Reporter: Tony Xu
>            Priority: Minor
>
> Currently Lucene has *TimeLimitingCollector* to support time-bound collection
> and it will throw
> *TimeExceededException* if timeout happens. This only works nicely with the
> single-thread low-level API from the IndexSearcher. The method signature is --
> *void search(List<LeafReaderContext> leaves, Weight weight, Collector collector)*
> The intended use is to always enclose the searcher.search(query, collector)
> call with a try ... catch and handle the timeout exception. Unfortunately
> when working with a *CollectorManager* in the multi-thread search context,
> the *TimeExceededException* thrown during collecting one leaf slice will be
> re-thrown by *IndexSearcher* without calling *CollectorManager*'s reduce(),
> even if other slices are successfully collected.
> The signature of the search API with *CollectorManager* is --
> *<C extends Collector, T> T search(Query query, CollectorManager<C, T> collectorManager)*
>
> The good news is that IndexSearcher handles *CollectionTerminatedException*
> gracefully by ignoring it. We can either wrap TimeLimitingCollector and throw
> *CollectionTerminatedException* when a timeout happens, or simply replace
> *TimeExceededException* with *CollectionTerminatedException*. Either way,
> we also need to maintain a flag that indicates whether a timeout occurred, so that
> the user knows it's a partial collection.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
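
For readers who want to see the "exponentially growing windows" idea from the comment in concrete form, here is a minimal sketch in plain Java. It is not Lucene or Elasticsearch code: WindowedTimeoutSketch, RangeScorer, Result and scoreWithTimeout are hypothetical names made up for illustration, and RangeScorer only stands in for a bulk scorer that can score a doc ID range. The timeout is checked once per window rather than once per document, and the result carries a timedOut flag so the caller can tell that a collection was partial, as the issue description asks for.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper, not part of Lucene: scores doc IDs window by window. */
final class WindowedTimeoutSketch {

    /** Stand-in for a bulk scorer: scores the half-open doc ID range [min, max). */
    interface RangeScorer {
        void score(int min, int max);
    }

    /** Outcome of one run. */
    static final class Result {
        final int nextDoc;      // first doc ID that was NOT scored
        final boolean timedOut; // true means the collection is partial
        final int checks;       // how many times the timeout was checked

        Result(int nextDoc, boolean timedOut, int checks) {
            this.nextDoc = nextDoc;
            this.timedOut = timedOut;
            this.checks = checks;
        }
    }

    /**
     * Scores [0, maxDoc) in windows whose size doubles after every window,
     * checking the timeout only before each window starts.
     */
    static Result scoreWithTimeout(RangeScorer scorer, int maxDoc,
                                   int initialWindow, BooleanSupplier timedOut) {
        if (initialWindow <= 0) {
            throw new IllegalArgumentException("initialWindow must be positive");
        }
        int min = 0;
        long window = initialWindow; // long, so doubling cannot overflow an int
        int checks = 0;
        while (min < maxDoc) {
            checks++;
            if (timedOut.getAsBoolean()) {
                return new Result(min, true, checks); // partial collection
            }
            int max = (int) Math.min((long) maxDoc, min + window);
            scorer.score(min, max);
            min = max;
            window = Math.min(window * 2, (long) Integer.MAX_VALUE);
        }
        return new Result(min, false, checks);
    }

    public static void main(String[] args) {
        // Assumed budget of 50 ms, chosen only for the demonstration.
        long deadline = System.nanoTime() + 50_000_000L;
        long[] scored = {0};
        Result r = scoreWithTimeout(
                (min, max) -> scored[0] += max - min,
                1_000_000, 1024,
                () -> System.nanoTime() - deadline > 0);
        System.out.println("scored=" + scored[0]
                + " timedOut=" + r.timedOut + " checks=" + r.checks);
    }
}
```

Because the window size doubles, the number of timeout checks grows only logarithmically with the number of documents, while short queries still get a check early on. The trade-off is that a late timeout may be noticed only after a large final window has been scored.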