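benwtrent commented on PR #13124:
URL: https://github.com/apache/lucene/pull/13124#issuecomment-1961808528

@zhaih I don't see any reason why we cannot also extend SerialMS and allow multiple threads per merge. We will have to update the base `MergeScheduler` class anyway, as the merges being run don't know anything about who kicked off the merge (and shouldn't). The reason for all the CMS discussion is that it is the hardest to implement correctly (to me anyway...). For SerialMS, users could provide a number, and it's a static executor with that number of threads.

> But still keep the current way of merge to keep the performance?

I really don't think users should configure numWorkers or workPerThread at all. I would much prefer that we supply good defaults and remove configuration. If we had done the parallelism outside the codec to begin with, I don't think we would have added any configurable values to the HNSW codec.

> So if we bind those two together whether we potentially prevent a part of users using the intra-segment merges?

I am not thinking about binding them. I think that `MergeScheduler` itself should be extended to return a TaskExecutor (probably defaulting to `null` to indicate none, or maybe `SameThreadExecutorService`). CMS is just the most difficult one to figure out.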
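
A rough sketch of the shape described above, not the actual Lucene API. Every name here (`SketchMergeScheduler`, `SketchSerialMergeScheduler`, `SketchMerge`, `getIntraMergeExecutor`) is hypothetical. It uses a plain `java.util.concurrent.Executor` in place of Lucene's `TaskExecutor`, and a same-thread executor as the default in place of `null` or `SameThreadExecutorService`.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the base MergeScheduler; not the real Lucene class.
abstract class SketchMergeScheduler {
  // Executor a running merge may use for intra-merge work.
  // Assumed default: run each task on the calling (merge) thread, i.e. no extra parallelism.
  Executor getIntraMergeExecutor() {
    return Runnable::run;
  }
}

// Hypothetical serial scheduler: the user provides one number, which becomes a fixed pool size.
final class SketchSerialMergeScheduler extends SketchMergeScheduler implements AutoCloseable {
  private final ExecutorService pool;

  SketchSerialMergeScheduler(int numThreads) {
    if (numThreads < 1) {
      throw new IllegalArgumentException("numThreads must be >= 1");
    }
    this.pool = Executors.newFixedThreadPool(numThreads);
  }

  @Override
  Executor getIntraMergeExecutor() {
    return pool;
  }

  @Override
  public void close() {
    pool.shutdown();
  }
}

// Hypothetical merge: it only sees an Executor and knows nothing about who kicked it off.
final class SketchMerge {
  // Submits numTasks trivial tasks, waits for all of them, and returns how many completed.
  static int run(Executor executor, int numTasks) {
    AtomicInteger completed = new AtomicInteger();
    CompletableFuture<?>[] futures = new CompletableFuture<?>[numTasks];
    for (int i = 0; i < numTasks; i++) {
      futures[i] = CompletableFuture.runAsync(completed::incrementAndGet, executor);
    }
    CompletableFuture.allOf(futures).join();
    return completed.get();
  }
}
```

With this shape, the merge code is the same whether the scheduler hands back the same-thread default or a pool, and the only user-facing knob in the serial case is the thread count.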
@zhaih I don't see any reason why we also cannot extend the SerlialMS and allow multiple threads per merge ran. We will have to update the base `MergeScheduler` class anyways as the merges being ran don't know anything about who kicked off the merge (and shouldn't). The reason for all the CMS discussion is that it is the hardest to implement correctly (to me anyways...). For SerialMS, users could provide a number and its a static executor with that number of threads. > But still keep the current way of merge to keep the performance? I really don't think users should configure numWorkers or workPerThread at all. I would much prefer us supply good defaults and remove configuration. If we did the parallelism outside the codec to being with, I don't think we would have added any configurable values to the HNSW codec. > So if we bind those two together whether we potentially prevent a part of users using the intra-segment merges? I am not thinking about binding them. I think that `MergeScheduler` itself should be extended to return a TaskExecutor (probably defaulting to `null` to indicate none, or maybe `SameThreadExecutorService`). CMS is just the most difficult one to figure out. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org