mikemccand commented on PR #13430: URL: https://github.com/apache/lucene/pull/13430#issuecomment-2137365637
Thank you for tackling this @carlosdelest! What a hairy challenge ... TMP really is its own little baby chess engine, with many things it is trying to optimize towards (what the ML world now calls "multi-objective"), playing against a worthy opponent (the indexing application that keeps indexing different-sized docs, doing some deletions/updates, refresh/commit, etc., producing all kinds of exotic segments). And of course, in addition to the app exotically birthing new segments, TMP itself is causing new big segments to appear each time a merge completes! This really is a game theory problem, heh.

+1 for the name `targetSearchConcurrency` -- it makes it clear this is about optimizing the "inter-segment concurrent search latency" in the resulting index, and I think it also makes it clear that `maxMergedSegmentBytes` still wins.

I think simply letting the largest tier hold up to `targetSearchConcurrency` segments, when that exceeds `segsPerTier`, is a good approach. After all, it is those biggest segments that you really need for concurrency, since searching them takes the most CPU / latency -- the lower tiers (1/10th, 1/100th, ... the size of the top tier) will be super fast by comparison and not the long pole in each query's latency.
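To make the top-tier idea concrete, here is a rough sketch in Java. It is not the code in the PR or in `TieredMergePolicy`: the class `TopTierBudgetSketch` and all three method names are hypothetical, the 1/10th-per-tier ratio is taken from the comment above, and the way the size cap is applied is just one possible reading of "`maxMergedSegmentBytes` still wins".

```java
// Hypothetical illustration only; none of these names exist in Lucene.
final class TopTierBudgetSketch {

  // The largest tier may hold whichever is bigger: segsPerTier or targetSearchConcurrency.
  static int allowedTopTierSegments(int segsPerTier, int targetSearchConcurrency) {
    return Math.max(segsPerTier, targetSearchConcurrency);
  }

  // Assumed reading of "maxMergedSegmentBytes still wins": split the index evenly
  // across the allowed top-tier segments, but never aim above the hard size cap.
  static long targetTopTierSegmentBytes(
      long totalIndexBytes, int allowedSegments, long maxMergedSegmentBytes) {
    long evenSplit = totalIndexBytes / allowedSegments;
    return Math.min(evenSplit, maxMergedSegmentBytes);
  }

  // Approximate segment size N tiers below the top, assuming each tier is 1/10th
  // the size of the tier above it (the ratio mentioned in the comment).
  static long approxLowerTierSegmentBytes(long topTierSegmentBytes, int tiersBelowTop) {
    long size = topTierSegmentBytes;
    for (int i = 0; i < tiersBelowTop; i++) {
      size /= 10;
    }
    return size;
  }

  public static void main(String[] args) {
    int allowed = allowedTopTierSegments(10, 16);
    long topBytes = targetTopTierSegmentBytes(40_000_000_000L, allowed, 5_000_000_000L);
    System.out.println("allowed top-tier segments: " + allowed);
    System.out.println("target top-tier segment bytes: " + topBytes);
    System.out.println("two tiers down: " + approxLowerTierSegmentBytes(topBytes, 2));
  }
}
```

With these made-up numbers, a segment two tiers down is about 1/100th the size of a top-tier segment, which is why only the top tier needs to be sized with search concurrency in mind.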