[
https://issues.apache.org/jira/browse/LUCENE-8675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17575251#comment-17575251
]
Adrien Grand commented on LUCENE-8675:
--------------------------------------
I wonder if we could avoid paying the cost of Scorer/BulkScorer initialization
multiple times by implementing Cloneable on these classes, similarly to how we
use cloning on IndexInputs to consume them from multiple threads. It would
require implementing Cloneable on a few other classes, e.g. PostingsEnum, and
maybe we'd need to set some restrictions to keep this feature reasonable, e.g.
it's only legal to clone when the current doc ID is -1. But this could help
parallelize collecting a single segment by assigning each clone its own range
of doc IDs.
A downside of this approach is that it wouldn't help parallelize the
initialization of Scorers, but I don't know if there is a way around it.
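
To make the idea concrete, here is a minimal, self-contained sketch of what
clone-then-partition could look like. It does not use Lucene's API:
ToyIterator, countMatches and the slicing arithmetic are hypothetical names
invented for illustration, and the int[] of postings stands in for whatever
state a real Scorer or PostingsEnum would share between clones. It assumes the
restriction suggested above (cloning is only legal while the doc ID is -1) and
that each clone is positioned with advance() at the start of its own range.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class CloneableScorerSketch {

    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Hypothetical stand-in for a Scorer/PostingsEnum; not a Lucene class. */
    static final class ToyIterator implements Cloneable {
        private final int[] postings; // sorted matching doc IDs, shared read-only by all clones
        private int index = -1;
        private int doc = -1;

        ToyIterator(int[] postings) {
            // Imagine the expensive initialization happening here, once.
            this.postings = postings;
        }

        int docID() {
            return doc;
        }

        /** Moves past the current position to the first matching doc ID >= target. */
        int advance(int target) {
            do {
                index++;
            } while (index < postings.length && postings[index] < target);
            doc = index < postings.length ? postings[index] : NO_MORE_DOCS;
            return doc;
        }

        int nextDoc() {
            return advance(doc + 1);
        }

        @Override
        public ToyIterator clone() {
            // Assumed restriction: cloning is only legal before iteration starts.
            if (doc != -1) {
                throw new IllegalStateException("clone() is only legal when docID() == -1");
            }
            try {
                return (ToyIterator) super.clone();
            } catch (CloneNotSupportedException e) {
                throw new AssertionError(e);
            }
        }
    }

    /** Counts matches in [0, maxDoc) by giving each clone its own doc ID range. */
    static int countMatches(ToyIterator prototype, int maxDoc, int slices) {
        ExecutorService pool = Executors.newFixedThreadPool(slices);
        try {
            List<Future<Integer>> parts = new ArrayList<>();
            for (int s = 0; s < slices; s++) {
                int min = (int) ((long) maxDoc * s / slices);       // inclusive
                int max = (int) ((long) maxDoc * (s + 1) / slices); // exclusive
                ToyIterator it = prototype.clone(); // cloned while the prototype is still at -1
                parts.add(pool.submit(() -> {
                    int count = 0;
                    for (int d = it.advance(min); d < max; d = it.nextDoc()) {
                        count++;
                    }
                    return count;
                }));
            }
            int total = 0;
            for (Future<Integer> part : parts) {
                total += part.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        ToyIterator prototype = new ToyIterator(new int[] {1, 4, 7, 20, 21, 50, 99});
        System.out.println(countMatches(prototype, 100, 4)); // prints 7
    }
}
```

Note that the constructor still runs once on a single thread, which is the
downside mentioned above: the sketch parallelizes collection, not
initialization.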
> Divide Segment Search Amongst Multiple Threads
> ----------------------------------------------
>
> Key: LUCENE-8675
> URL: https://issues.apache.org/jira/browse/LUCENE-8675
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/search
> Reporter: Atri Sharma
> Priority: Major
> Attachments: PhraseHighFreqP50.png, PhraseHighFreqP90.png,
> TermHighFreqP50.png, TermHighFreqP90.png
>
>
> Segment search is a single-threaded operation today, which can be a
> bottleneck for large analytical workloads that index a lot of data and run
> complex queries touching multiple segments (imagine a composite query with a
> range query and filters on top). This ticket is for discussing the idea of
> splitting the search of a single segment across multiple threads, based on
> mutually exclusive document ID ranges.
> This will be a two-phase effort, with the first phase targeting queries that
> return all matching documents (collectors not terminating early). The
> second-phase patch will introduce staged execution and will build on top of
> the first-phase patch.