noahprince22 commented on issue #6189: URL: https://github.com/apache/incubator-pinot/issues/6189#issuecomment-724356911

The particular case where this is necessary is when we have a large amount of data, and therefore a large number of segments. In that situation, we want to make sure this scales. The naive solution does not scale well in time once you get to millions of segments. The specialized data structure scales well in time, but is potentially a memory hog: brokers like to keep _all_ of the routing tables in memory, and that's going to get hungry. Can we quantify this? If there are 20 million segments across 10 tables, is it a gig of memory? 100 gigs?

On the difficulty-to-implement front, why not just start with a [Java SortedSet](https://docs.oracle.com/javase/8/docs/api/java/util/SortedSet.html) ordered on start timestamp, then filter those results on end timestamp? Or use a pre-canned library built for this specific task.
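For concreteness, here is a minimal sketch of that idea. Class and method names (`SegmentTimePruner`, `Segment`, `overlapping`) are mine for illustration, not Pinot's actual types; it just keys segments by start timestamp in a sorted map, then filters the candidates on end timestamp:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

public class SegmentTimePruner {

  // Minimal stand-in for a segment's time metadata.
  public static class Segment {
    final String name;
    final long startMillis;
    final long endMillis;

    Segment(String name, long startMillis, long endMillis) {
      this.name = name;
      this.startMillis = startMillis;
      this.endMillis = endMillis;
    }
  }

  // Segments keyed by start timestamp. A TreeMap keeps the keys sorted,
  // so a range lookup on start time is O(log n) plus the result size.
  private final NavigableMap<Long, List<Segment>> segmentsByStart = new TreeMap<>();

  public void add(Segment segment) {
    segmentsByStart.computeIfAbsent(segment.startMillis, k -> new ArrayList<>()).add(segment);
  }

  // A segment overlaps [queryStart, queryEnd] iff:
  //   segment.start <= queryEnd   (cheap: headMap over the sorted keys)
  //   segment.end   >= queryStart (linear filter over the surviving candidates)
  public List<Segment> overlapping(long queryStart, long queryEnd) {
    List<Segment> result = new ArrayList<>();
    for (List<Segment> bucket : segmentsByStart.headMap(queryEnd, true).values()) {
      for (Segment segment : bucket) {
        if (segment.endMillis >= queryStart) {
          result.add(segment);
        }
      }
    }
    return result;
  }
}
```

The obvious weakness, and presumably where a purpose-built interval structure earns its memory cost, is that when most segments start before `queryEnd` (e.g., a query over recent data on a long-lived table), the end-timestamp filter still scans nearly everything. But it would give us a cheap baseline to measure the specialized structure against.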
The particular cases where this is necessary is when we have a large amount of data necessitating a large number of segments. In this particular situation, we want to make sure this scales. The naive solution does not scale particularly well on time when you get to millions of segments. The specialized data structure scales well on time, but is potentially a memory hog. The brokers like to keep _all_ of the routing tables in memory. That's going to get hungry. Can we quantify this? If there's 20 million segments across 10 tables, is it a gig of memory? 100 gigs? On the difficulty to implement front, why not just start with a [Java Sorted Set](https://docs.oracle.com/javase/8/docs/api/java/util/SortedSet.html) on start timestamp? Then filter on end timestamp on those results? Or use a pre canned library built for this specific task. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org