msfroh commented on issue #9721:
URL: https://github.com/apache/lucene/issues/9721#issuecomment-2098929299

   > jpountz said:
   > It depends on queries. For term queries, duplicating the overhead of
   > looking up terms in the terms dict may be ok, but for multi-term queries and
   > point queries that often compute the bit set of matches of the whole segment,
   > this could significantly hurt throughput. Maybe it doesn't have to be this way
   > for the first iteration (progress over perfection), but this feels important to
   > me so that we don't have weird recommendations like "only enable intra-segment
   > concurrency if you don't use multi-term or point queries".
   
   I was thinking a bit about intra-segment concurrency this morning, 
specifically about multi-term, point, and vector queries, which do most of 
their heavy lifting up front (to the point where I've seen a bunch of 
profiles where relatively little time is spent actually iterating through 
DISIs). 
   
   Those queries (or at least their ScorerSuppliers) "know" when they're going 
to be expensive, so it feels like they're in the best position to say "I should 
be parallelized".  What if ScorerSupplier could take a reference to the 
IndexSearcher's executor and return a CompletableFuture for the Scorer? 
Something like TermQuery could return a "completed" future, while "expensive" 
scorers could be computed on another thread. It could be a quick and easy way 
to parallelize some of the per-segment computation.
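
   A rough sketch of what that could look like is below. It is only an 
illustration of the idea, not Lucene code: `Scorer` here is a stand-in class, 
and `AsyncScorerSupplier`, `getAsync`, `cheap`, `expensive`, and `buildAll` are 
hypothetical names that do not exist in Lucene. The cheap supplier returns an 
already-completed future, while the expensive one does its up-front work on 
the executor passed in by the caller.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

// Hypothetical stand-ins for illustration; these are NOT Lucene's own classes.
final class AsyncScorerSketch {

    /** Stand-in for a per-segment scorer; it only records a label. */
    static final class Scorer {
        final String description;

        Scorer(String description) {
            this.description = description;
        }
    }

    /** Hypothetical async variant of a scorer supplier (assumed API). */
    interface AsyncScorerSupplier {
        CompletableFuture<Scorer> getAsync(Executor executor);
    }

    /** Cheap case (think TermQuery): build inline, return a completed future. */
    static AsyncScorerSupplier cheap(String name) {
        return executor -> CompletableFuture.completedFuture(new Scorer(name));
    }

    /** Expensive case (think point or multi-term query): build on the executor. */
    static AsyncScorerSupplier expensive(String name) {
        return executor -> CompletableFuture.supplyAsync(() -> {
            // Placeholder for the costly up-front work, e.g. computing the
            // bit set of matches for the whole segment.
            return new Scorer(name);
        }, executor);
    }

    /** Starts every supplier first, then waits for all of them in order. */
    static List<Scorer> buildAll(List<AsyncScorerSupplier> suppliers, Executor executor) {
        List<CompletableFuture<Scorer>> futures = suppliers.stream()
                .map(s -> s.getAsync(executor))
                .collect(Collectors.toList());
        return futures.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newFixedThreadPool(2);
        try {
            List<Scorer> scorers =
                    buildAll(List.of(cheap("term"), expensive("points")), executor);
            scorers.forEach(s -> System.out.println(s.description));
        } finally {
            executor.shutdown();
        }
    }
}
```

   Collecting all the futures before joining any of them is what lets the 
expensive suppliers overlap; joining inside the first stream would serialize 
them again.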


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

