suddendust opened a new issue #7229:
URL: https://github.com/apache/pinot/issues/7229


   We have use cases where we would like to move older data to a cheap object 
store such as S3 and keep only the most recent data in Pinot. One such use case 
is storing distributed trace data: our query patterns show that more than 90% 
of queries target the last 24 hours, yet the retention period is 30 days. In 
this case, we would like to keep only the last 24 hours' worth of data in 
Pinot and move the rest to S3. 
   
   From our initial discussions, this would involve work on two fronts:
   
   1. Moving older data from Pinot to S3.
   2. A Pinot-Presto connector, so that a query can be served from Pinot, S3, 
or both, depending on its time span.
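
To make the second point concrete, here is a minimal sketch of the time-based routing decision. It is not an existing Pinot or Presto API: `route_by_time`, `TimeRange`, `RoutedQuery`, and `HOT_WINDOW` are hypothetical names, and the 24-hour window is taken from the tracing example above. It only splits a query's time span at the cutoff; actually running the two sub-queries and merging their results is left out.

```python
from datetime import datetime, timedelta, timezone
from typing import NamedTuple, Optional

# Assumption from the tracing example: the last 24 hours stay in Pinot.
HOT_WINDOW = timedelta(hours=24)


class TimeRange(NamedTuple):
    start: datetime  # inclusive
    end: datetime    # exclusive


class RoutedQuery(NamedTuple):
    pinot: Optional[TimeRange]  # part of the span served from Pinot
    s3: Optional[TimeRange]     # part of the span served from S3


def route_by_time(start: datetime, end: datetime, now: datetime,
                  hot_window: timedelta = HOT_WINDOW) -> RoutedQuery:
    """Split [start, end) at the hot/cold cutoff (hypothetical helper)."""
    if start >= end:
        raise ValueError("start must be before end")
    cutoff = now - hot_window
    # Anything older than the cutoff is assumed to have been moved to S3.
    cold = TimeRange(start, min(end, cutoff)) if start < cutoff else None
    # Anything at or after the cutoff is assumed to still be in Pinot.
    hot = TimeRange(max(start, cutoff), end) if end > cutoff else None
    return RoutedQuery(pinot=hot, s3=cold)


# Example: a query over the last 3 days touches both stores.
now = datetime(2021, 7, 25, 12, 0, tzinfo=timezone.utc)
routed = route_by_time(now - timedelta(days=3), now, now)
```

With this split, a query confined to the last 24 hours never touches S3, which matches the observation that most queries fall in that window.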
   
   I have something like this in mind (it relates to the distributed tracing 
example above; excuse the rough drawing):
   
   ![Screenshot 2021-07-25 at 7 13 21 PM](https://user-images.githubusercontent.com/84911643/127535064-3d017d81-9d04-47a0-8a8a-ded9c29a1b22.png)
   
   I have created this issue to get the discussion started.
   
   Thanks!

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


