mikemccand commented on issue #14148:
URL: https://github.com/apache/lucene/issues/14148#issuecomment-3019757238

   I think the "wrapped" (inner) `MergePolicy` that 
`BandwidthCappedMergeScheduler.getMergePolicy()` would provide would be quite a 
bit simpler than `TieredMergePolicy` (TMP) is today, because it would trigger 
merges not on complex criteria but simply when bandwidth is available.  So an 
entirely new inner `MergePolicy` might be doable/cleaner.
   
   `BandwidthCappedMergeScheduler` will need much of what 
`ConcurrentMergeScheduler` (CMS) already does today: running merges in 
background threads and instrumenting the bytes written by each over time.  
Maybe subclass CMS, or maybe fork it temporarily (poaching its code) with a 
`TODO` to unfork?
   
   But it's a good idea to baby step first -- maybe the inner `MergePolicy` 
just wraps an actual `TieredMergePolicy` but filters the merges that TMP wants 
to do, so that those merges are only returned through the wrapper to 
`IndexWriter` when bandwidth allows?  That's maybe a good first step -- it 
enables us to effectively make TMP defer its merge decisions until "a good 
time" (bandwidth freed up).
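
   A rough sketch of that filtering step, in plain Java with no Lucene 
dependency so it stays self-contained.  Every name here (`CandidateMerge`, 
`BandwidthGate`, `filter`) is hypothetical; in Lucene the proposals would be 
`MergePolicy.OneMerge` instances and the gate would presumably sit inside a 
wrapping policy such as a `FilterMergePolicy` subclass.  It assumes a simple 
byte budget that refills at the configured rate:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a proposed merge (Lucene's real type is MergePolicy.OneMerge). */
final class CandidateMerge {
  final String name;
  final long estimatedBytes;

  CandidateMerge(String name, long estimatedBytes) {
    this.name = name;
    this.estimatedBytes = estimatedBytes;
  }
}

/** Hypothetical gate that releases proposed merges only while byte budget remains. */
final class BandwidthGate {
  private final double bytesPerSec;
  private final double maxBurstBytes;
  private double availableBytes = 0; // may go negative: a large merge leaves "debt"
  private long lastRefillNanos;

  BandwidthGate(double bytesPerSec, double maxBurstBytes, long nowNanos) {
    this.bytesPerSec = bytesPerSec;
    this.maxBurstBytes = maxBurstBytes;
    this.lastRefillNanos = nowNanos;
  }

  /** Returns the merges to hand on now; the rest stay deferred and can be proposed again later. */
  List<CandidateMerge> filter(List<CandidateMerge> proposed, long nowNanos) {
    refill(nowNanos);
    List<CandidateMerge> released = new ArrayList<>();
    for (CandidateMerge merge : proposed) {
      if (availableBytes <= 0) {
        break; // budget exhausted: defer this merge and everything after it
      }
      availableBytes -= merge.estimatedBytes;
      released.add(merge);
    }
    return released;
  }

  /** Credits the budget for the time elapsed since the last call, capped at the burst size. */
  private void refill(long nowNanos) {
    double elapsedSec = (nowNanos - lastRefillNanos) / 1e9;
    if (elapsedSec > 0) {
      availableBytes = Math.min(maxBurstBytes, availableBytes + elapsedSec * bytesPerSec);
      lastRefillNanos = nowNanos;
    }
  }
}
```

   Letting the budget go negative is one possible design choice, not the only 
one: a merge larger than the burst size still gets to run, and later merges 
simply wait until the debt is paid off, which keeps the long-term average near 
the configured rate.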
   
   For configuring the bandwidth, I think there needs to be some tolerance for 
not precisely hitting it.  E.g. I ask for around 20 MiB/s long term average 
(the rate at which water comes out of the faucet into the bathtub), but the control 
algorithm cannot be perfect.  E.g. up front it knows how large a merge is, but 
it doesn't know how long that merge will take -- that is so variable by 
application, but perhaps predictable over time as it sees how long merges 
"typically" take?  So the docs for this class should make it clear it's kind of 
a rough target?
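
   One way the "predictable over time" part might look, again as a 
self-contained sketch rather than a proposal for the actual API: keep an 
exponential moving average of the throughput seen in completed merges and use 
it to guess how long the next merge of a known size will take.  The class 
name, method names, and the smoothing factor `alpha` are all assumptions made 
for illustration:

```java
import java.util.OptionalDouble;

/** Hypothetical estimator: learns typical merge throughput from completed merges. */
final class MergeDurationEstimator {
  private final double alpha;                  // weight given to the newest observation
  private double avgBytesPerSec = Double.NaN;  // NaN until the first merge completes

  MergeDurationEstimator(double alpha) {
    if (!(alpha > 0 && alpha <= 1)) {
      throw new IllegalArgumentException("alpha must be in (0, 1], got " + alpha);
    }
    this.alpha = alpha;
  }

  /** Folds one finished merge into the moving average; ignores degenerate samples. */
  void recordCompletedMerge(long bytes, double seconds) {
    if (bytes <= 0 || seconds <= 0) {
      return;
    }
    double observed = bytes / seconds;
    avgBytesPerSec = Double.isNaN(avgBytesPerSec)
        ? observed
        : alpha * observed + (1 - alpha) * avgBytesPerSec;
  }

  /** Estimated seconds for a merge of the given size, or empty if nothing has been observed yet. */
  OptionalDouble estimateSeconds(long bytes) {
    return Double.isNaN(avgBytesPerSec)
        ? OptionalDouble.empty()
        : OptionalDouble.of(bytes / avgBytesPerSec);
  }
}
```

   The estimate is only as good as recent history, which is another reason 
the configured bandwidth should be documented as an approximate target.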


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


