suddendust edited a comment on issue #7229: URL: https://github.com/apache/pinot/issues/7229#issuecomment-893269497
@mcvsubbu @kishoreg here's the updated FSM. I had not marked any transitions out of MOVED earlier.

<img width="784" alt="Screenshot 2021-08-05 at 11 00 15 AM" src="https://user-images.githubusercontent.com/84911643/128296427-b814f610-429a-4fbf-8248-84df1bc7960a.png">

We are trying to go with Presto -> Pinot -> S3 (no changes to the connector atm). I haven't really given thought to how this design would behave when there are too many segments to load. With 2T/day and 500M segments, we have around 4000 segments per day. With a retention of 30 days, we are looking at 4000 * 30 = 120,000 segments. If someone makes a query that literally covers the last 30 days of data, we might have to load all of them (yikes!). Maybe this can be controlled with a max segment config, as has been done in [this](https://github.com/apache/pinot/issues/6248) PR?

I guess we can avoid much of the complexity by not touching the FSM? Essentially, we offload all the download business to the servers themselves - they determine they don't have the segment, trigger a download, and respond some time later asynchronously.

>Lastly, let us ask ourselves why the broker needs to know that a segment has been moved.

@mcvsubbu I gave it some thought and it looks like it doesn't. It is the responsibility of the servers to furnish the segment - either from local storage or from the deep store. The broker just needs to query them and wait, as it is doing right now. It doesn't need to be concerned about where the segment is.

This implementation does seem to come with a fair number of caveats - too many segments to download, a chatty cluster due to too many state transitions, long download times, etc. I am sure I am missing some minute but important details here, and could definitely use some advice. Thanks :)

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org