mikemccand commented on PR #14443:
URL: https://github.com/apache/lucene/pull/14443#issuecomment-2844952688

   > Sorry I'm still a bit confused: how is this approach better than just 
committing more frequently, replicating commits as soon as they are created, 
and refreshing searchers as soon as commits are replicated?
   
   In Amazon's usage, and I would expect in other high-rate NRT segment 
replication systems, it's helpful to strongly decouple the production of 
new commit points (triggered by time, or by X GB of new segments) from the 
replication of these commits.
   
   During peace time (happy path), what you're suggesting works well -- commit 
and replication can always match each other 1 for 1.  But under duress (unhappy 
path), commit points can be produced faster (usually, hopefully, temporarily) 
than replication can keep up, maybe because of a crazy high rate of document 
updates, or a slow pipe for replication, or bit errors needing lots of retries, 
etc. (distributed systems seem to have all sorts of fun ways to become 
problematic!) ... and for those situations, it's really nice to have this 
decoupling easily accessible.
   
   It's also delightful because Lucene makes it quite simple to keep more than 
one commit point alive at once (it's just a custom `IndexDeletionPolicy`) ... 
building this decoupling on top of that is "relatively" easy heh.
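
   To make the decoupling concrete, here is a minimal sketch in plain Java. It is a toy model, not Lucene code and not the PR's implementation: the class `CommitRetention` and its methods are hypothetical names made up for illustration. It assumes the retention rule "keep the newest commit plus whichever commit the replicator is currently copying"; in a real system that rule would live inside a custom `IndexDeletionPolicy`, which decides which commit points to delete.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/**
 * Toy model (hypothetical, not a Lucene class): tracks which commit
 * generations must stay alive when commits are produced independently
 * of replication.
 */
final class CommitRetention {
    // Generations of commit points that have not been deleted yet.
    private final TreeSet<Long> live = new TreeSet<>();
    // Generation currently being replicated, or -1 if none.
    private long pinned = -1;

    /** Indexer side: a new commit point was produced. */
    public synchronized void onCommit(long generation) {
        live.add(generation);
        prune();
    }

    /** Replicator side: pin the newest commit and start copying it. */
    public synchronized long beginReplication() {
        if (live.isEmpty()) {
            throw new IllegalStateException("no commit point yet");
        }
        pinned = live.last();
        return pinned;
    }

    /** Replicator side: copy finished (or abandoned), release the pin. */
    public synchronized void endReplication() {
        pinned = -1;
        prune();
    }

    /** Drop every commit except the newest one and the pinned one. */
    private void prune() {
        if (live.isEmpty()) {
            return;
        }
        long newest = live.last();
        live.removeIf(g -> g != newest && g != pinned);
    }

    public synchronized List<Long> liveCommits() {
        return new ArrayList<>(live);
    }

    public static void main(String[] args) {
        CommitRetention retention = new CommitRetention();
        retention.onCommit(1);
        retention.beginReplication();   // replicator starts copying commit 1
        retention.onCommit(2);          // indexer keeps committing meanwhile
        retention.onCommit(3);
        System.out.println(retention.liveCommits());
        retention.endReplication();
        System.out.println(retention.liveCommits());
    }
}
```

   In this model the indexer never waits on the replicator: while commit 1 is still being copied, commit 2 is dropped as soon as commit 3 arrives, and the replicator's next round jumps straight to the newest commit instead of working through a backlog.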


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

