mikemccand commented on issue #14148: URL: https://github.com/apache/lucene/issues/14148#issuecomment-2607224591
Doing this in `MergeScheduler` (`MS`) is indeed another option. It'd mean you could cap replication bandwidth independently of your `MergePolicy` (`MP`).

`MS` could even fine-tune where it does its throttling. E.g. maybe it's fine to run many merges at once, if "bandwidth" to the local storage of your `IndexWriter` is not a problem, but then once a merge completes, don't commit it to the index (swap in the newly merged segment in place of the old ones) until enough time has passed. In fact, NRT replica warming of newly merged segments takes this same approach: delay the merge-commit long enough that replicas can pre-copy the newly merged segments, ensuring NRT latency (index-to-searchability delay) stays low even when copying massive merged segments.

I.e. "throttle at start" or "throttle at end" are two sub-options to doing this in `MS`. I guess "throttle during" is also an option, just like the IO rate limiter can do today ... in fact, the IO rate limiter is already one way to cap bandwidth.

But the big problem with doing this throttling late in the game (in `MS`, not `MP`) is that `MP` can then make poor choices. I.e. maybe `MP` decides to "merge these 5 segments now". It submits the merge, but `MS` waits before starting the merge (to cap the bandwidth). Say it waits ten minutes ... well, after those ten minutes, `MP` might now want to make a different/better merge choice, since the index looks different (after ten minutes of indexing the update stream). Yet because it already submitted the prior (now stale) merge, it is not allowed to pick any of those five segments, since they are now held out as merging.

This would effectively mean `MP` is forced to make choices based on stale/delayed information. It's sort of like the good old days of day-traders who only had access to the "20 minute delayed stock prices". But maybe that handicap to `TieredMergePolicy` (TMP) would be fine in practice?
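To make "throttle at end" concrete, here is a minimal sketch of one way the commit delay could be computed. `MergeCommitThrottle` is a hypothetical helper, not a Lucene class, and it assumes a simple model: each committed merge "charges" the replication link for `mergedBytes / cap` seconds, and the next merge may not commit until that charge is paid off.

```java
/** Hypothetical helper (not part of Lucene): spaces out merge commits to honor a bytes/sec cap. */
final class MergeCommitThrottle {
  private final long maxBytesPerSec;
  // Earliest time (ms) at which the next merged segment may be swapped into the index.
  private long nextAllowedCommitMs = 0;

  MergeCommitThrottle(long maxBytesPerSec) {
    if (maxBytesPerSec <= 0) {
      throw new IllegalArgumentException("maxBytesPerSec must be > 0");
    }
    this.maxBytesPerSec = maxBytesPerSec;
  }

  /**
   * Called when a merge has finished writing its segment at mergeDoneMs.
   * Returns the time (ms) at which the merge may be committed to the index.
   */
  long scheduleCommit(long mergeDoneMs, long mergedBytes) {
    // Commit as soon as the merge is done AND the previous commit's charge is paid off.
    long commitMs = Math.max(mergeDoneMs, nextAllowedCommitMs);
    // Copying mergedBytes at the cap keeps the link busy this long (rounded down).
    long copyMs = (mergedBytes * 1000L) / maxBytesPerSec;
    nextAllowedCommitMs = commitMs + copyMs;
    return commitMs;
  }
}
```

With a cap of 100 bytes/sec, a 1000-byte merge finishing at t=0 commits right away, and a second merge finishing at t=2000 ms has to wait until t=10000 ms. The merges themselves still run concurrently against local storage; only the swap into the index is delayed.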
Maybe the choices it makes based on stale index geometry are not so different from what it would make with "live" stock prices? Though, if it was a max-sized merge, and enough updates/deletes arrive in those ten minutes (shrinking the live size of the five segments), then `MP` would've been able to merge maybe a 6th segment into the same merge while staying under the max segment size. Not sure ...

Or maybe even some combination of the two approaches? `MP` could look at `IndexWriter` (`IW`), see that there is backpressure (paused merges because `MS` has to stay under the bandwidth cap), and delay picking new merges even if it wants to? Maybe that's a simple solution ... `MS` throttles, and `MP` detects that backpressure.
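The combined approach can be sketched in a few lines of plain Java. Everything here is hypothetical: `BackpressureAwarePicker`, the `pausedMerges` supplier, and the use of segment names as strings are stand-ins for illustration, not Lucene APIs. How `MP` would actually learn about paused merges is exactly the open question above.

```java
import java.util.List;
import java.util.function.Function;
import java.util.function.IntSupplier;

/** Hypothetical wrapper (not a Lucene class): declines to pick merges while the scheduler is throttling. */
final class BackpressureAwarePicker {
  // Assumed signal: how many merges the scheduler is currently holding back.
  private final IntSupplier pausedMerges;
  // Underlying selection logic: returns the segments to merge, or an empty list for "no merge".
  private final Function<List<String>, List<String>> delegate;

  BackpressureAwarePicker(IntSupplier pausedMerges,
                          Function<List<String>, List<String>> delegate) {
    this.pausedMerges = pausedMerges;
    this.delegate = delegate;
  }

  List<String> pickMerge(List<String> liveSegments) {
    if (pausedMerges.getAsInt() > 0) {
      // Backpressure: pick nothing, so no segments are locked into a merge that would only queue.
      return List.of();
    }
    // No backpressure: decide based on the current (fresh) index geometry.
    return delegate.apply(liveSegments);
  }
}
```

Because nothing is submitted while merges are paused, no segments are held out as merging in the meantime, and the eventual choice is made against the index as it looks once the backpressure clears. In Lucene itself, this kind of check might live in a `FilterMergePolicy` subclass wrapping the real policy, but that is an assumption, not something proposed in this thread.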