suddendust commented on issue #7229:
URL: https://github.com/apache/pinot/issues/7229#issuecomment-893269497


   @mcvsubbu @kishoreg here's the updated FSM. I had not marked any transitions 
out of MOVED earlier. 
   
   <img width="784" alt="Screenshot 2021-08-05 at 11 00 15 AM" 
src="https://user-images.githubusercontent.com/84911643/128296427-b814f610-429a-4fbf-8248-84df1bc7960a.png">
   
   We are trying to go with Presto -> Pinot -> S3 (no changes to the connector 
atm).
   
   I haven't really thought about how this design would behave when there 
are too many segments to load. With 2 TB/day and 500 MB segments, we have 
around 4000 segments per day. With a retention of 30 days, we are looking at 
4000 * 30 = 120,000 segments. If someone runs a query that spans the last 30 
days of data, we might have to load all of them (yikes!). Maybe this can be 
controlled with a max segment config, as has been done in 
[this](https://github.com/apache/pinot/issues/6248) PR?
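   
   A quick sketch of that arithmetic, reading 2T as 2 TB (decimal) and 500M as 
500 MB per segment. The `max_segments` cap and the function names are 
hypothetical, just to illustrate what a max segment config would guard against; 
they are not an existing Pinot setting.

```python
# Back-of-the-envelope segment count using the figures from this comment.
DAILY_INGEST_MB = 2_000_000   # 2 TB/day in decimal units (assumption)
SEGMENT_SIZE_MB = 500         # 500 MB per segment (assumption)
RETENTION_DAYS = 30


def segments_per_day(daily_mb, segment_mb):
    """Number of segments produced per day, rounded up."""
    return -(-daily_mb // segment_mb)  # ceiling division on integers


def segments_for_query(query_days, per_day):
    """Worst case: a query spanning query_days touches every segment in range."""
    return query_days * per_day


def exceeds_cap(needed, max_segments):
    """Hypothetical guard: True if a query would need more segments than allowed."""
    return needed > max_segments


per_day = segments_per_day(DAILY_INGEST_MB, SEGMENT_SIZE_MB)
total = segments_for_query(RETENTION_DAYS, per_day)
```

   Whether a query over the cap should be rejected or served partially is a 
separate decision that this sketch does not make.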
   
   I guess we can avoid much of the complexity by not touching the FSM? 
Essentially, offload all the download business to the servers themselves: they 
determine they don't have the segment, trigger a download, and respond some 
time later, asynchronously.
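   
   Roughly what I have in mind on the server side, as a minimal sketch only. 
`LazySegmentServer`, `get_segment`, and the `fetch_from_deep_store` callable 
are made-up names for illustration, not Pinot classes or APIs. The server 
checks its local segments, starts one background download per missing segment, 
and hands back a future that completes once the segment is available.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor


class LazySegmentServer:
    """Hypothetical server that pulls missing segments from a deep store on demand."""

    def __init__(self, fetch_from_deep_store, max_workers=4):
        self._fetch = fetch_from_deep_store  # callable: segment name -> segment data
        self._local = {}                     # segments already present on this server
        self._in_flight = {}                 # segment name -> Future of a running download
        self._lock = threading.Lock()
        self._pool = ThreadPoolExecutor(max_workers=max_workers)

    def get_segment(self, name):
        """Return a Future that resolves to the segment, downloading it if needed."""
        with self._lock:
            if name in self._local:
                done = Future()
                done.set_result(self._local[name])
                return done
            future = self._in_flight.get(name)
            if future is None:
                # First request for a missing segment triggers the download;
                # later requests share the same Future instead of re-downloading.
                future = self._pool.submit(self._download, name)
                self._in_flight[name] = future
            return future

    def _download(self, name):
        try:
            data = self._fetch(name)
            with self._lock:
                self._local[name] = data
            return data
        finally:
            # Clear the in-flight entry on success or failure so a failed
            # download can be retried by a later request.
            with self._lock:
                self._in_flight.pop(name, None)


# Usage with a stand-in for the deep store (e.g. S3).
calls = []


def fake_fetch(name):
    calls.append(name)
    return f"data:{name}"


server = LazySegmentServer(fake_fetch)
first = server.get_segment("seg_0").result(timeout=5)
second = server.get_segment("seg_0").result(timeout=5)
```

   The caller (the broker, in this design) only waits on the future; it never 
needs to know whether the segment was local or had to be fetched.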
   
   >Lastly, let us ask ourselves why the broker needs to know that a segment 
has been moved.
   
   @mcvsubbu I gave it some thought and it looks like it doesn't. It is the 
responsibility of the servers to furnish the segment, either from local storage 
or from the deep store. The broker just needs to query them and wait, as it 
does right now.
   
   This implementation does seem to come with a fair number of caveats: too 
many segments to download, a chatty cluster due to too many state transitions, 
long download times, etc. 
   
   I am sure I am missing some minute but important details here, so would 
appreciate some expert guidance. Thanks :)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


