mikemccand commented on PR #13430:
URL: https://github.com/apache/lucene/pull/13430#issuecomment-2137365637

   Thank you for tackling this @carlosdelest!  What a hairy challenge ... 
`TieredMergePolicy` (TMP) really is its own little baby chess engine, with many 
things it is trying to optimize towards (what the ML world now calls 
"multi-objective"), playing against a worthy opponent (the indexing application 
that keeps indexing differently sized docs, doing some deletions/updates, 
refresh/commit, etc., producing all kinds of exotic segments).  And of course, 
in addition to the app exotically birthing new segments, TMP itself causes new 
big segments to appear when each merge completes!  This really is a game theory 
problem, heh.
   
   +1 for the name `targetSearchConcurrency` -- it makes it clear this is about 
optimizing the "inter-segment concurrent search latency" in the resulting 
index, and I think it also makes it clear that `maxMergedSegmentBytes` still 
wins.  I think simply letting the largest tier hold up to 
`targetSearchConcurrency` segments whenever that exceeds `segsPerTier` is a 
good approach.  After all, it is those biggest segments that you really need 
for concurrency, since searching them takes the most CPU / latency -- the lower 
tiers (1/10th, 1/100th, ... the size of the top tier) will be super fast by 
comparison and not the long pole in each query's latency.
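
   To make the "largest tier allows `targetSearchConcurrency` segments when 
that exceeds `segsPerTier`" idea concrete, here is a minimal sketch.  It is not 
the actual TMP code and not necessarily what the PR does: the class and method 
names (`TopTierBudgetSketch`, `allowedTopTierSegments`) are hypothetical, and it 
only models the segment-count budget for the top tier, ignoring 
`maxMergedSegmentBytes`, deletes, and the lower tiers.

```java
// Hypothetical illustration only; not part of Lucene's TieredMergePolicy.
public final class TopTierBudgetSketch {

    private TopTierBudgetSketch() {}

    /**
     * Returns how many segments the largest tier may hold before it is
     * considered over budget: the usual per-tier allowance, raised to the
     * target search concurrency when that is larger.
     */
    static int allowedTopTierSegments(int segsPerTier, int targetSearchConcurrency) {
        if (segsPerTier < 1 || targetSearchConcurrency < 1) {
            throw new IllegalArgumentException("both arguments must be >= 1");
        }
        return Math.max(segsPerTier, targetSearchConcurrency);
    }

    public static void main(String[] args) {
        // Concurrency target above segsPerTier: the top tier budget grows.
        System.out.println(allowedTopTierSegments(10, 16)); // prints 16
        // Concurrency target below segsPerTier: nothing changes.
        System.out.println(allowedTopTierSegments(10, 4));  // prints 10
    }
}
```

   Only the top tier's budget is touched in this sketch, on the reasoning 
above that the smaller tiers are cheap to search and do not need extra 
segments for concurrency.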


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

