Re: [PR] Add new parallel merge task executor for parallel actions within a single merge action [lucene]

via GitHub Wed, 20 Mar 2024 05:23:21 -0700


benwtrent commented on PR #13190:
URL: https://github.com/apache/lucene/pull/13190#issuecomment-2009404137


   @mikemccand @jpountz 
   
   So, thinking about this more as I fell asleep.
   
   This is how throttling will work as it is in this PR:
   
    - Throttling is per thread: an intra-merge thread only gets throttled 
once the bytes it has written reach the configured rate limit.
    - Consequently, we may write up to `rateLimitBytes*numIntraMergeIO` bytes 
in total before any single thread gets throttled.
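   
   To make the arithmetic concrete, here is a minimal sketch. It is not 
   Lucene's `RateLimiter`; the class, the method, and the round-robin write 
   pattern are all hypothetical. It counts how many bytes are written in total 
   before any one writer reaches its own per-thread limit:

```java
/** Hypothetical illustration only; this is not Lucene's RateLimiter. */
final class PerThreadThrottleSketch {

  /**
   * Simulates numWriters writers that take turns writing chunkBytes each
   * (round-robin is an assumption made to keep the sketch deterministic).
   * Each writer tracks only its own bytes. Returns the total bytes written
   * across all writers at the moment the first writer reaches rateLimitBytes.
   */
  static long bytesBeforeFirstPause(long rateLimitBytes, int numWriters, long chunkBytes) {
    if (numWriters <= 0 || chunkBytes <= 0) {
      throw new IllegalArgumentException("numWriters and chunkBytes must be positive");
    }
    long[] written = new long[numWriters]; // one private counter per writer
    long total = 0;
    while (true) {
      for (int i = 0; i < numWriters; i++) {
        if (written[i] >= rateLimitBytes) {
          return total; // first writer that would be throttled
        }
        written[i] += chunkBytes;
        total += chunkBytes;
      }
    }
  }
}
```

   With a 1024-byte limit, 4 writers, and 256-byte chunks, 
   `bytesBeforeFirstPause(1024, 4, 256)` returns 4096, i.e. 
   `rateLimitBytes*numIntraMergeIO`.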
   
   This means we throttle less often than the total number of bytes actually 
written would warrant. Maybe this is OK, as the throttling logic could "catch 
up" if things continue to get backed up? Throttling has always been a "best 
effort" thing anyway.
   
   Even if we somehow made the RateLimiter throttle every thread (via some 
global semaphore or something...), we would still only start throttling once 
one of the multiple threads hits the byte throttling limit, because bytes are 
still counted per thread.
   
   IMO, either we: 
   
    - account for bytes globally (per directory) & throttle globally (global 
lock that pauses all threads)
    - accept that throttling is per thread and that bytes written are measured 
in individual threads.
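   
   A rough sketch of the two accounting choices, under stated assumptions: 
   `ByteBudget`, `GlobalByteBudget`, and `PerThreadByteBudget` are hypothetical 
   names, not part of Lucene or this PR, and the sketch only answers "should 
   the caller pause now?" without modeling time or the pause itself:

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical interface: record written bytes, report whether to pause. */
interface ByteBudget {
  boolean recordAndCheck(long bytes);
}

/** Option 1: one counter shared by every thread writing to the directory. */
final class GlobalByteBudget implements ByteBudget {
  private final long limit;
  private final AtomicLong written = new AtomicLong();

  GlobalByteBudget(long limit) {
    this.limit = limit;
  }

  @Override
  public boolean recordAndCheck(long bytes) {
    // Every thread adds to the same total, so the limit trips sooner.
    return written.addAndGet(bytes) >= limit;
  }
}

/** Option 2: each thread counts only the bytes it wrote itself. */
final class PerThreadByteBudget implements ByteBudget {
  private final long limit;
  private final ThreadLocal<long[]> written = ThreadLocal.withInitial(() -> new long[1]);

  PerThreadByteBudget(long limit) {
    this.limit = limit;
  }

  @Override
  public boolean recordAndCheck(long bytes) {
    long[] counter = written.get(); // private to the calling thread
    counter[0] += bytes;
    return counter[0] >= limit;
  }
}
```

   From a single thread the two behave the same. They diverge only when 
   several threads share one budget: with the global variant a second thread's 
   writes count toward the same limit, while with the per-thread variant each 
   thread starts from zero. The global variant would also need some shared 
   mechanism to actually pause all threads, which this sketch leaves out.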
   
   Selfishly, I would prefer the second option, as it's the simplest.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


