expani commented on issue #13745: URL: https://github.com/apache/lucene/issues/13745#issuecomment-3058268142
I was looking to integrate Intra Segment Concurrent Search and found that this same problem also applies to downstream consumers of Lucene such as OpenSearch/ElasticSearch/Solr, which use Collectors to build out their aggregation frameworks.

Since we have to make a Query/Collector aware, via its constructor, that it is participating in an Intra Segment Concurrent Search, like the [initial PR did for TotalHitCountCollectorManager](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/TotalHitCountCollectorManager.java#L48-L62), the changes required would increase unless we go on a case-by-case basis.

Recording the call flow for my own understanding:

```
IndexSearcher#search(Query query, CollectorManager<C, T> collectorManager)
--- calls ---
IndexSearcher#search(Weight weight, CollectorManager<C, T> collectorManager, C firstCollector)
--- calls from a Runnable per Slice ---
IndexSearcher#search(LeafReaderContextPartition[] partitions, Weight weight, Collector collector)
--- calls ---
IndexSearcher#searchLeaf(LeafReaderContext ctx, int minDocId, int maxDocId, Weight weight, Collector collector)
```

Two threads can invoke `searchLeaf` with the same LeafReaderContext but for different partitions of the segment.

Things inside `searchLeaf` that need to be done only once, even during intra segment concurrent search:

```
Collector#getLeafCollector()
Weight#scorerSupplier()
ScorerSupplier#bulkScorer()
LeafReaderContext#reader()#getLiveDocs()
LeafCollector#finish()
```

Other downstream users do some extra operations when profiling the queries.

My proposal is to handle the de-duplication at the IndexSearcher and ensure the above listed steps are only done once per LeafSlice.

@javanna I would like to pick this up unless you are almost done with PointRangeQuery.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
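To make the "only once per leaf" idea in the comment above concrete, here is a minimal Java sketch of the bookkeeping it implies. It is not Lucene code: `SharedLeafState`, `acquire`, and `release` are hypothetical names, the leaf key stands in for something like a `LeafReaderContext`, and the sketch assumes the number of partitions per leaf is known up front. It only shows how the first partition could run the setup step and the last one could run the finish step; it does not address whether the shared objects are safe to use from several threads at once.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;
import java.util.function.Function;

/**
 * Hypothetical helper (not a Lucene API): shares one per-leaf state object
 * among all partitions of a segment and finishes it exactly once.
 */
final class SharedLeafState<K, S> {

  /** Shared state for one leaf plus the number of partitions still running. */
  private static final class Entry<S> {
    final S state;
    final AtomicInteger remaining;

    Entry(S state, int partitions) {
      this.state = state;
      this.remaining = new AtomicInteger(partitions);
    }
  }

  private final ConcurrentHashMap<K, Entry<S>> entries = new ConcurrentHashMap<>();
  private final Function<K, S> init;   // stands in for e.g. getLeafCollector()
  private final Consumer<S> finish;    // stands in for e.g. LeafCollector#finish()

  SharedLeafState(Function<K, S> init, Consumer<S> finish) {
    this.init = init;
    this.finish = finish;
  }

  /**
   * Returns the shared state for leafKey. computeIfAbsent is atomic, so
   * init runs for the first caller only; later callers reuse its result.
   */
  S acquire(K leafKey, int partitionCount) {
    return entries
        .computeIfAbsent(leafKey, k -> new Entry<>(init.apply(k), partitionCount))
        .state;
  }

  /** Marks one partition as done; the last partition to call this runs finish. */
  void release(K leafKey) {
    Entry<S> entry = entries.get(leafKey);
    if (entry != null && entry.remaining.decrementAndGet() == 0) {
      entries.remove(leafKey);
      finish.accept(entry.state);
    }
  }
}
```

In this sketch each partition would call `acquire` before it starts collecting and `release` when it is done, so the setup and finish callbacks each run once per leaf regardless of how many partitions the segment was split into.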
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org