kishoreg edited a comment on issue #7229: URL: https://github.com/apache/pinot/issues/7229#issuecomment-893674195
> Possibly two things to consider here:
>
> * I am wondering if data in S3 needs to have the same granularity as the data in Pinot or can we aggregate the data to a higher-level dimension while aging it out to S3? For example, if data in the latest segment has a granularity of 1 second, then data in a segment 10 days old may have a granularity of 10 seconds (thereby reducing the data size by factor of 10), and data 30 days old may have a granularity of 1 hour (thereby reducing the data size by a factor of ~4000).

This is a good point, but I consider it more of an optimization, and the design should not be built around it. There are cases where users would not want to reduce the granularity. Moreover, one of the main reasons here is that S3 is cheaper and users want to keep the data for a long time.

> * Also, would adding a segment cache between Pinot and S3 help with latency? Usually I would expect some sort of a locality of reference when we pull in data from S3

This is definitely possible and something to consider once we have the ability to load a segment when we receive the query.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

---------------------------------------------------------------------
For additional commands, e-mail: commits-h...@pinot.apache.org