jpountz commented on PR #13124:
URL: https://github.com/apache/lucene/pull/13124#issuecomment-1961602616

   Maybe some of these things are too ambitious, but ideally I'd like it to 
work this way.
   
   `ConcurrentMergeScheduler` already tracks a `maxMergeCount`, which controls 
the max number of running merges, and a `maxThreadCount`, which caps the 
number of threads that merges may use. Ideally I'd like 
`maxThreadCount` to include both threads used for inter-merge concurrency and 
intra-merge concurrency. So this is similar to your first suggestion except 
that I'm bounding the total number of threads to `maxThreadCount` rather than 
`maxThreadCount + maxMergeCount`.
   
   Intra-merge concurrency would take advantage of the fact that there will 
sometimes be fewer active merges than available threads. E.g. we could have 
a pool of threads for intra-merge concurrency 
that would try to ensure that its number of active threads is always less than 
or equal to `max(0, maxThreadCount - mergeThreads.size())`. For instance 
`Executor#execute` could be implemented such that it runs the runnable in the 
current thread if the number of active merges plus the number of active threads 
in the intra-merge thread pool is greater than or equal to `maxThreadCount`. 
Otherwise it would fork to the intra-merge thread pool.
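
   A minimal sketch of that `Executor#execute` rule, under stated assumptions: the class name `BudgetedIntraMergeExecutor`, the `activeMerges` supplier (standing in for `mergeThreads.size()`) and the backing `pool` are hypothetical names for illustration, not existing `ConcurrentMergeScheduler` members.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntSupplier;

/** Hypothetical sketch only; not the actual ConcurrentMergeScheduler code. */
final class BudgetedIntraMergeExecutor implements Executor {
  private final int maxThreadCount;
  // Assumption: reports the number of currently running merge threads,
  // i.e. what the comment calls mergeThreads.size().
  private final IntSupplier activeMerges;
  // Assumption: some backing pool dedicated to intra-merge work.
  private final Executor pool;
  private final AtomicInteger activeIntraMergeThreads = new AtomicInteger();

  BudgetedIntraMergeExecutor(int maxThreadCount, IntSupplier activeMerges, Executor pool) {
    this.maxThreadCount = maxThreadCount;
    this.activeMerges = activeMerges;
    this.pool = pool;
  }

  int activeIntraMergeThreads() {
    return activeIntraMergeThreads.get();
  }

  @Override
  public void execute(Runnable task) {
    // Optimistically reserve one intra-merge slot, then check the budget.
    int reserved = activeIntraMergeThreads.incrementAndGet();
    if (activeMerges.getAsInt() + reserved > maxThreadCount) {
      // Budget was already exhausted before the reservation:
      // give the slot back and run in the calling (merge) thread.
      activeIntraMergeThreads.decrementAndGet();
      task.run();
      return;
    }
    try {
      pool.execute(() -> {
        try {
          task.run();
        } finally {
          activeIntraMergeThreads.decrementAndGet();
        }
      });
    } catch (RuntimeException e) {
      // The pool refused the task, so release the reserved slot.
      activeIntraMergeThreads.decrementAndGet();
      throw e;
    }
  }
}
```

   The check is best-effort: the number of active merges can change right after it is read, so the bound may be exceeded briefly, which matches the "try to ensure" wording above.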
   
   Concurrent merging for vectors wants to know the number of available workers 
today, but maybe we can change the logic (like you suggested) to split the doc 
ID space into some number of slices, e.g. `max(128, maxDoc / 2^16)`, and 
sequentially send these slices to `Executor#execute` (sometimes running in the 
same thread, sometimes forked to the intra-merge thread pool), except the last 
one that would be forced to run in the current thread (like we used to do in 
`IndexSearcher` until recently).
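
   The slicing idea could look roughly like the sketch below. `SlicedMergeWork`, `SliceTask` and `numSlices` are hypothetical names, and clamping the slice count to `maxDoc` (so that no slice is empty) is my own assumption on top of the `max(128, maxDoc / 2^16)` formula quoted above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;

/** Hypothetical sketch only; not the actual vector merging code. */
final class SlicedMergeWork {
  /** Work to perform on one slice [minDocInclusive, maxDocExclusive) of the doc ID space. */
  interface SliceTask {
    void process(int minDocInclusive, int maxDocExclusive);
  }

  static int numSlices(int maxDoc) {
    int wanted = Math.max(128, maxDoc >>> 16); // max(128, maxDoc / 2^16)
    // Assumption: never create more slices than docs, and always at least one.
    return Math.max(1, Math.min(maxDoc, wanted));
  }

  static void run(int maxDoc, Executor executor, SliceTask task) {
    int numSlices = numSlices(maxDoc);
    List<CompletableFuture<Void>> futures = new ArrayList<>();
    for (int i = 0; i < numSlices; i++) {
      // Long arithmetic avoids int overflow for large maxDoc.
      int from = (int) ((long) maxDoc * i / numSlices);
      int to = (int) ((long) maxDoc * (i + 1) / numSlices);
      // The last slice is forced onto the current thread; the others go to
      // the executor, which may itself decide to run them in the caller.
      Executor target = (i == numSlices - 1) ? Runnable::run : executor;
      futures.add(CompletableFuture.runAsync(() -> task.process(from, to), target));
    }
    for (CompletableFuture<Void> future : futures) {
      future.join(); // rethrows a slice failure wrapped in CompletionException
    }
  }
}
```

   With this shape the caller never needs to know how many workers are available: it submits slices one by one and lets the executor decide, per slice, whether to fork or to run inline.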


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


