kishoreg edited a comment on issue #7229:
URL: https://github.com/apache/pinot/issues/7229#issuecomment-893674195


   
   > Possibly two things to consider here:
   > 
   > * I am wondering if data in S3 needs to have the same granularity as the 
data in Pinot or can we aggregate the data to a higher-level dimension while 
aging it out to S3? For example, if data in the latest segment has a 
granularity of 1 second, then data in a segment 10 days old may have a 
granularity of 10 seconds (thereby reducing the data size by a factor of 10), and 
data 30 days old may have a granularity of 1 hour (thereby reducing the data 
size by a factor of ~4000).
   
   This is a good point, but I consider it more of an optimization, and the 
design should not be built around it. There are cases where users would not 
want to coarsen the granularity. Moreover, one of the main reasons for this 
feature is that S3 is cheaper and users want to keep their data for a long time.
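
   For illustration only, the age-based rollup described in the quoted 
suggestion could look roughly like the sketch below. Everything in it is an 
assumption made for the example: `rollup`, `granularity_for_age`, the 
(timestamp, value) row shape, and the use of a sum as the aggregate are 
hypothetical and are not part of Pinot. Note that going from 1-second to 
1-hour buckets is a reduction of 3600x, which the quote rounds to ~4000.

```python
from collections import defaultdict


def granularity_for_age(age_days):
    """Bucket width in seconds for data of a given age (thresholds from the quote)."""
    if age_days >= 30:
        return 3600  # 1 hour
    if age_days >= 10:
        return 10    # 10 seconds
    return 1         # keep the original 1-second granularity


def rollup(rows, bucket_seconds):
    """Sum values into time buckets of width bucket_seconds.

    rows is an iterable of (epoch_seconds, value) pairs; the result is a
    sorted list of (bucket_start, summed_value) pairs.
    """
    buckets = defaultdict(int)
    for ts, value in rows:
        bucket_start = ts - ts % bucket_seconds
        buckets[bucket_start] += value
    return sorted(buckets.items())


# Twenty 1-second rows collapse to two 10-second rows.
rows = [(t, 1) for t in range(20)]
compacted = rollup(rows, granularity_for_age(10))
```

The sketch also shows why this is optional rather than a default: a sum is 
only one possible aggregate, and once rows are merged the per-second values 
cannot be recovered.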
   
   > * Also, would adding a segment cache between Pinot and S3 help with 
latency? Usually I would expect some sort of locality of reference when we 
pull in data from S3.
   
   This is definitely possible and something to consider once we have the 
ability to load a segment at the time a query is received.
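
   As a rough sketch of what such a cache might look like, the example below 
loads a segment on the first query that needs it and evicts the least 
recently used segment when full. `SegmentCache` and the `fetch` callback are 
hypothetical names for this example; they do not correspond to any existing 
Pinot class or S3 client call.

```python
from collections import OrderedDict


class SegmentCache:
    """Least-recently-used cache that loads segments on demand."""

    def __init__(self, capacity, fetch):
        self.capacity = capacity        # maximum number of cached segments
        self.fetch = fetch              # callable: segment name -> segment data
        self._entries = OrderedDict()   # ordered from oldest to most recent use

    def get(self, name):
        if name in self._entries:
            # Cache hit: mark the segment as most recently used.
            self._entries.move_to_end(name)
            return self._entries[name]
        # Cache miss: pull the segment from the backing store (e.g. S3).
        segment = self.fetch(name)
        self._entries[name] = segment
        if len(self._entries) > self.capacity:
            # Evict the least recently used segment.
            self._entries.popitem(last=False)
        return segment


# Stand-in for a remote download; records which segments were fetched.
fetched = []

def fake_fetch(name):
    fetched.append(name)
    return name.upper()

cache = SegmentCache(capacity=2, fetch=fake_fetch)
for segment_name in ["a", "b", "a", "c", "b"]:
    cache.get(segment_name)
```

With locality of reference, repeated queries against the same segments are 
served from the cache, and only the misses pay the S3 round trip.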
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


