harshavamsi commented on issue #9721:
URL: https://github.com/apache/lucene/issues/9721#issuecomment-2099280561

   > > jpountz said:
   > > It depends on queries. For term queries, duplicating the overhead of 
looking up terms in the terms dict may be ok, but for multi-term queries and 
point queries that often compute the bit set of matches of the whole segment, 
this could significantly hurt throughput. Maybe it doesn't have to be this way 
for the first iteration (progress over perfection), but this feels important to 
me so that we don't have weird recommendations like "only enable intra-segment 
concurrency if you don't use multi-term or point queries".
   > 
   > I was thinking a bit about intra-segment concurrency this morning and got 
thinking specifically about multi-term, point, and vector queries that do most 
of their heavy-lifting up front (to the point where I've seen a bunch of 
profiles where relatively little time is spent actually iterating through 
DISIs).
   > 
   > Those queries (or at least their ScorerSuppliers) "know" when they're 
going to be expensive, so it feels like they're in the best position to say "I 
should be parallelized". What if ScorerSupplier could take a reference to the 
IndexSearcher's executor and return a CompletableFuture for the Scorer? 
Something like TermQuery could return a "completed" future, while "expensive" 
scorers could be computed on another thread. It could be a quick and easy way 
to parallelize some of the per-segment computation.
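
The quoted idea can be sketched roughly as follows. This is only an illustration in plain JDK code: `SketchScorer`, `AsyncScorerSupplier`, `cheap` and `expensive` are hypothetical names invented for the example and are not part of Lucene's API. A cheap supplier hands back an already-completed future, while an expensive one does its up-front work on the searcher's executor.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncScorerSupplierSketch {

    /** Hypothetical stand-in for a scorer; NOT Lucene's Scorer class. */
    interface SketchScorer {
        String describe();
    }

    /** Hypothetical supplier that decides for itself whether to use the executor. */
    interface AsyncScorerSupplier {
        CompletableFuture<SketchScorer> getAsync(Executor executor);
    }

    /** Cheap case (e.g. a single term): build inline, return a completed future. */
    static AsyncScorerSupplier cheap(String name) {
        SketchScorer scorer = () -> name;
        return executor -> CompletableFuture.completedFuture(scorer);
    }

    /** Expensive case (e.g. a range): do the up-front work on the executor. */
    static AsyncScorerSupplier expensive(String name) {
        return executor -> CompletableFuture.supplyAsync(() -> {
            // Placeholder for the heavy lifting (building a bit set of matches, etc.).
            SketchScorer scorer = () -> name;
            return scorer;
        }, executor);
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newFixedThreadPool(2);
        try {
            CompletableFuture<SketchScorer> term = cheap("term").getAsync(executor);
            CompletableFuture<SketchScorer> range = expensive("range").getAsync(executor);
            System.out.println(term.isDone());            // already complete, no thread hop
            System.out.println(range.join().describe());  // waits for the executor task
        } finally {
            executor.shutdown();
        }
    }
}
```

The caller treats both cases uniformly through the future, so only the suppliers that know they are expensive pay for a hand-off to another thread.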
   
   To add to this, I was wondering whether we could extend the concurrency 
further, to within a single query. For example, a range query today traverses 
the BKD tree over the whole range. What if we split the range into sub-ranges 
and handed each one to an executor to intersect? Then multiple threads could 
build the DISI together.
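
A minimal sketch of the range-splitting idea, using only JDK classes. The `long[] values` array (one value per doc ID) is a hypothetical stand-in for a segment's point index, and `matchSubRange` / `parallelRange` are names made up for this example; a real version would intersect each sub-range against the BKD tree rather than scan an array.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class SplitRangeSketch {

    // Stand-in for intersecting one sub-range: values[docId] is that doc's value.
    // Note: this toy version scans every doc per sub-range; a tree-based index
    // would only visit the cells that overlap [lo, hi].
    static BitSet matchSubRange(long[] values, long lo, long hi) {
        BitSet bits = new BitSet(values.length);
        for (int doc = 0; doc < values.length; doc++) {
            if (values[doc] >= lo && values[doc] <= hi) {
                bits.set(doc);
            }
        }
        return bits;
    }

    // Splits [lo, hi] into at most `slices` sub-ranges, matches each on the
    // executor, then ORs the partial bit sets together.
    // Assumption: the range is small enough that hi - lo + 1 does not overflow.
    static BitSet parallelRange(long[] values, long lo, long hi, int slices,
                                ExecutorService executor) {
        List<CompletableFuture<BitSet>> futures = new ArrayList<>();
        long span = hi - lo + 1;
        long step = Math.max(1, (span + slices - 1) / slices);
        for (long start = lo; start <= hi; start += step) {
            final long subLo = start;
            final long subHi = Math.min(hi, start + step - 1);
            futures.add(CompletableFuture.supplyAsync(
                    () -> matchSubRange(values, subLo, subHi), executor));
        }
        BitSet result = new BitSet(values.length);
        for (CompletableFuture<BitSet> f : futures) {
            result.or(f.join()); // merge on the calling thread
        }
        return result;
    }

    public static void main(String[] args) {
        long[] values = {5, 42, 17, 99, 23, 8, 61, 30};
        ExecutorService executor = Executors.newFixedThreadPool(4);
        try {
            BitSet hits = parallelRange(values, 10, 60, 4, executor);
            System.out.println(hits); // prints {1, 2, 4, 7}
        } finally {
            executor.shutdown();
        }
    }
}
```

Each task writes to its own bit set, so no synchronization is needed until the final merge; whether the merge cost outweighs the parallel gain would need measuring on real segments.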
   
   Similarly, in a terms query, each term could build its BitSet/iterator in 
parallel, and the conjunction/disjunction over them could then happen in a 
single step.
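
The per-term variant can be sketched the same way. Here the `Map<String, int[]>` of postings is a hypothetical stand-in for the terms dictionary and postings lists, and all method names are invented for illustration; none of this is Lucene API.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ParallelTermsSketch {

    // Turns one term's (hypothetical) postings list into a bit set of doc IDs.
    static BitSet postingsToBitSet(int[] docIds, int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        for (int doc : docIds) {
            bits.set(doc);
        }
        return bits;
    }

    // Builds one bit set per term, each on the executor, in term order.
    static List<BitSet> buildPerTerm(Map<String, int[]> postings, List<String> terms,
                                     int maxDoc, ExecutorService executor) {
        List<CompletableFuture<BitSet>> futures = new ArrayList<>();
        for (String term : terms) {
            int[] docs = postings.getOrDefault(term, new int[0]);
            futures.add(CompletableFuture.supplyAsync(
                    () -> postingsToBitSet(docs, maxDoc), executor));
        }
        List<BitSet> sets = new ArrayList<>();
        for (CompletableFuture<BitSet> f : futures) {
            sets.add(f.join());
        }
        return sets;
    }

    // Disjunction: docs matching any term.
    static BitSet disjunction(List<BitSet> sets) {
        BitSet result = new BitSet();
        for (BitSet s : sets) {
            result.or(s);
        }
        return result;
    }

    // Conjunction: docs matching every term (empty if there are no terms).
    static BitSet conjunction(List<BitSet> sets) {
        if (sets.isEmpty()) {
            return new BitSet();
        }
        BitSet result = (BitSet) sets.get(0).clone();
        for (int i = 1; i < sets.size(); i++) {
            result.and(sets.get(i));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, int[]> postings = Map.of(
                "apache", new int[]{0, 2, 5},
                "lucene", new int[]{2, 3, 5},
                "solr", new int[]{1, 5});
        ExecutorService executor = Executors.newFixedThreadPool(3);
        try {
            List<BitSet> sets = buildPerTerm(
                    postings, List.of("apache", "lucene", "solr"), 6, executor);
            System.out.println(disjunction(sets)); // prints {0, 1, 2, 3, 5}
            System.out.println(conjunction(sets)); // prints {5}
        } finally {
            executor.shutdown();
        }
    }
}
```

The combination step stays single-threaded in this sketch; only the per-term work is fanned out.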


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


