noahprince22 commented on issue #6189: URL: https://github.com/apache/incubator-pinot/issues/6189#issuecomment-724356911

The particular case where this is necessary is when we have a large amount of data, and therefore a large number of segments. In that situation, we want to make sure this scales. The naive solution does not scale well in time once you get to millions of segments. The specialized data structure scales well in time, but is potentially a memory hog: brokers like to keep _all_ of the routing tables in memory, and that's going to get hungry. Can we quantify this? If there are 20 million segments across 10 tables, is it a gig of memory? 100 gigs?

On the difficulty-to-implement front, why not just start with a [Java SortedSet](https://docs.oracle.com/javase/8/docs/api/java/util/SortedSet.html) ordered on start timestamp, then filter those results on end timestamp? Or use a pre-canned library built for this specific task.
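For concreteness, here is a minimal sketch of that idea. Class and method names (`SegmentTimePruner`, `Segment`, `overlapping`) are mine for illustration, not Pinot's actual types; it just keys segments by start timestamp in a sorted map, then filters the candidates on end timestamp:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

public class SegmentTimePruner {

  // Minimal stand-in for a segment's time metadata.
  public static class Segment {
    final String name;
    final long startMillis;
    final long endMillis;

    Segment(String name, long startMillis, long endMillis) {
      this.name = name;
      this.startMillis = startMillis;
      this.endMillis = endMillis;
    }
  }

  // Segments keyed by start timestamp. A TreeMap keeps the keys sorted,
  // so a range lookup on start time is O(log n) plus the result size.
  private final NavigableMap<Long, List<Segment>> segmentsByStart = new TreeMap<>();

  public void add(Segment segment) {
    segmentsByStart.computeIfAbsent(segment.startMillis, k -> new ArrayList<>()).add(segment);
  }

  // A segment overlaps [queryStart, queryEnd] iff:
  //   segment.start <= queryEnd   (cheap: headMap over the sorted keys)
  //   segment.end   >= queryStart (linear filter over the surviving candidates)
  public List<Segment> overlapping(long queryStart, long queryEnd) {
    List<Segment> result = new ArrayList<>();
    for (List<Segment> bucket : segmentsByStart.headMap(queryEnd, true).values()) {
      for (Segment segment : bucket) {
        if (segment.endMillis >= queryStart) {
          result.add(segment);
        }
      }
    }
    return result;
  }
}
```

The obvious weakness, and presumably where a purpose-built interval structure earns its memory cost, is that when most segments start before `queryEnd` (e.g., a query over recent data on a long-lived table), the end-timestamp filter still scans nearly everything. But it would give us a cheap baseline to measure the specialized structure against.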
The particular cases where this is necessary is when we have a large amount of data necessitating a large number of segments. In this particular situation, we want to make sure this scales. The naive solution does not scale particularly well on time when you get to millions of segments. The specialized data structure scales well on time, but is potentially a memory hog. The brokers like to keep _all_ of the routing tables in memory. That's going to get hungry. Can we quantify this? If there's 20 million segments across 10 tables, is it a gig of memory? 100 gigs? On the difficulty to implement front, why not just start with a [Java Sorted Set](https://docs.oracle.com/javase/8/docs/api/java/util/SortedSet.html) on start timestamp? Then filter on end timestamp on those results? Or use a pre canned library built for this specific task. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org