mikemccand commented on issue #14148: URL: https://github.com/apache/lucene/issues/14148#issuecomment-3019757238
I think the "wrapped" (inner) `MergePolicy` that `BandwidthCappedMergeScheduler.getMergePolicy()` would provide would be quite a bit simpler than TMP is today because it would not trigger merges based on complex criteria but rather when bandwidth is available. So an entirely new inner `MergePolicy` might be doable/cleaner.

`BandwidthCappedMergeScheduler` will need much of what CMS already does today (run merges in bg threads, instrument the bytes written by each over time)? Maybe subclassing CMS, or maybe forking it temporarily (poaching its code) with a `TODO` to unfork?

But it's a good idea to baby step first -- maybe the inner `MergePolicy` just wraps an actual `TieredMergePolicy` but filters the merges that TMP wanted to do so that those merges are only returned through the wrapping to `IndexWriter` when bandwidth allows? That's maybe a good first step -- it enables us to effectively make TMP defer its merge decisions until "a good time" (bandwidth freed up).

For configuring the bandwidth, I think there needs to be some tolerance for not precisely hitting it. E.g. I ask for around 20 MiB/s long term average (the rate at which water comes out of faucet into bathtub), but the control algorithm cannot be perfect. E.g. up front it knows how large a merge is, but it doesn't know how long that merge will take -- that is so variable by application, but perhaps predictable over time as it sees how long merges "typically" take? So the docs for this class should make it clear it's kind of a rough target?
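To make the "only release merges when bandwidth allows" idea concrete, here is a rough sketch of the bookkeeping such a filter might do. It is plain Java with no Lucene types: `MergeBandwidthBudget`, `admit`, and the credit cap are all hypothetical names and parameters, not existing Lucene APIs, and candidate merges are represented only by their size in bytes. The one design choice worth noting is that a merge is admitted whenever the credit is positive, even if the merge is larger than the remaining credit; the resulting debt has to be paid off before anything else is released. That keeps large merges from starving, and it is also why the configured rate is only a rough long-term target.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch (not a Lucene API): a token-bucket style budget for merge bytes. */
final class MergeBandwidthBudget {
    private final double bytesPerSec; // long-term target rate (the "faucet")
    private final double maxCredit;   // cap on unused credit that can be saved up
    private double credit;            // bytes we may spend now; negative means we are in debt
    private long lastNanos;           // time of the last refill

    MergeBandwidthBudget(double bytesPerSec, double maxCredit, long nowNanos) {
        this.bytesPerSec = bytesPerSec;
        this.maxCredit = maxCredit;
        this.credit = 0;
        this.lastNanos = nowNanos;
    }

    /** Accrue credit for the time elapsed since the last call, up to the cap. */
    private void refill(long nowNanos) {
        double elapsedSec = (nowNanos - lastNanos) / 1e9;
        credit = Math.min(maxCredit, credit + elapsedSec * bytesPerSec);
        lastNanos = nowNanos;
    }

    /**
     * Given the sizes (in bytes) of the merges an inner policy wants to run,
     * returns the prefix that may be released now. Remaining candidates are
     * simply deferred; the caller would offer them again later.
     */
    List<Long> admit(List<Long> candidateSizes, long nowNanos) {
        refill(nowNanos);
        List<Long> admitted = new ArrayList<>();
        for (long size : candidateSizes) {
            if (credit <= 0) {
                break; // out of budget: defer this merge and everything after it
            }
            credit -= size; // may overshoot into debt, hence "rough target"
            admitted.add(size);
        }
        return admitted;
    }
}
```

With a 20 MiB/s target, a 50 MiB merge offered after one second is released immediately (leaving 30 MiB of debt), and a 10 MiB merge offered behind it waits until the debt is repaid, about two seconds later. A real implementation would presumably also feed back the observed bytes written per merge, as suggested above, rather than trusting the up-front size estimate.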