mcvsubbu commented on issue #5751:
URL: 
https://github.com/apache/incubator-pinot/issues/5751#issuecomment-663679922


   Here is my suggested approach.
   1. Add a table config indicating that the table's time boundary should be
the next unit up (i.e. the next day for daily push, the next hour for hourly
push, etc.).
   2. Change the time boundary code to move the time boundary up when new
segments are added to the table.
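
   As a rough illustration of the two steps above, here is a minimal sketch.
It assumes "next unit up" means the start of the day or hour following a
segment's end time, with units aligned to the epoch in UTC. The names
`UNIT_MS`, `next_unit_up` and `on_segment_added` are hypothetical and are not
part of Pinot.

```python
HOUR_MS = 60 * 60 * 1000
DAY_MS = 24 * HOUR_MS

# Hypothetical mapping from a push-frequency setting to a unit length.
UNIT_MS = {"hourly": HOUR_MS, "daily": DAY_MS}


def next_unit_up(end_time_ms, push_frequency):
    """Return the start of the unit that follows the one containing end_time_ms."""
    unit = UNIT_MS[push_frequency]
    return (end_time_ms // unit + 1) * unit


def on_segment_added(current_boundary_ms, segment_end_time_ms, push_frequency):
    """Move the boundary forward (never backward) when a segment is added."""
    candidate = next_unit_up(segment_end_time_ms, push_frequency)
    if current_boundary_ms is None:
        return candidate
    return max(current_boundary_ms, candidate)
```

   Note that `on_segment_added` reacts to every single segment, which is
exactly why a multi-segment push would move the boundary too early.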
   
   In this case, there is no new API, but for use cases that push multiple
segments, the time boundary will be advanced as soon as the first segment is
pushed, before the remaining segments are available. This will lead to
inconsistency.
   
   Pinot currently has the inconsistency problem only if the realtime (RT)
table is not there, but we will be introducing a new one now.
   
   We can, however, introduce an API as well, which can send out a message to
the brokers to advance the time boundary. This API will be invoked by the push
job when all segments are pushed. This will reduce the inconsistency window to
some narrow cases such as:
   (1) Three segments are being pushed. After the first one, a broker
restarts and, looking at the table config and the most recent timestamp,
advances the boundary. Since the other two segments are not pushed yet, the
broker has no way of knowing that a message is going to come by later. Corner
case, but maybe we can leave that for the day when we support "atomic" push of
data.
   (2) Three segments are being pushed. The push completes, but the external
view is not yet updated (slow server, slow helix, whatever). The broker
meanwhile gets the message to move the time boundary.
   
   The second one is more likely to happen, but either one is a bad experience 
for the application. 
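
   To make the message-based flow concrete, here is a toy sketch of a push job
that uploads all segments and only then asks the controller to advance the
boundary. `Broker`, `Controller`, `advance_time_boundary` and `run_push_job`
are made-up names for illustration, not existing Pinot classes or endpoints.

```python
class Broker:
    """Toy broker that only moves its time boundary when told to."""

    def __init__(self, boundary_ms):
        self.boundary_ms = boundary_ms

    def on_advance_message(self, new_boundary_ms):
        # Ignore stale or duplicate messages so the boundary never moves back.
        if new_boundary_ms > self.boundary_ms:
            self.boundary_ms = new_boundary_ms


class Controller:
    """Toy controller: stores segments and relays the advance request."""

    def __init__(self, brokers):
        self.brokers = brokers
        self.segment_end_times = {}

    def upload_segment(self, name, end_time_ms):
        # Uploading alone does not touch any broker's boundary.
        self.segment_end_times[name] = end_time_ms

    def advance_time_boundary(self, new_boundary_ms):
        # Hypothetical API, invoked by the push job after its last upload.
        for broker in self.brokers:
            broker.on_advance_message(new_boundary_ms)


def run_push_job(controller, segments, new_boundary_ms):
    """Upload every (name, end_time_ms) pair, then request the advance once."""
    for name, end_time_ms in segments:
        controller.upload_segment(name, end_time_ms)
    controller.advance_time_boundary(new_boundary_ms)
```

   This toy model does not cover case (1): a broker that restarts mid-push
would recompute its boundary from the table config rather than wait for the
message. It also does not model the external view lag in case (2).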
   
   One mitigation may be to include the segment names in the controller API
call and have the controller wait for the external view (EV) of these segments
to be ONLINE before issuing the message, but that is also not complete: the
broker may still not have seen the new EV.
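
   The controller-side wait could look roughly like the sketch below. It
assumes the external view can be read as a mapping of segment name to
per-server state, and that "ONLINE" means every listed replica is ONLINE.
`fetch_external_view` is a hypothetical callable standing in for whatever the
controller uses to read the EV.

```python
import time


def segment_is_online(view, name):
    """True if the segment has at least one replica and all are ONLINE."""
    states = list(view.get(name, {}).values())
    return bool(states) and all(state == "ONLINE" for state in states)


def wait_until_online(segment_names, fetch_external_view,
                      timeout_s=60.0, poll_interval_s=1.0):
    """Poll the external view until all named segments are ONLINE.

    fetch_external_view is a hypothetical callable returning
    {segment_name: {server_name: state}}.
    Returns True on success, False if the timeout expires first.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        view = fetch_external_view()
        if all(segment_is_online(view, name) for name in segment_names):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval_s)
```

   Only when this returns True would the controller send the advance message.
As noted above, that narrows the window but does not close it, since each
broker reads the EV on its own schedule.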
   
   Ideas?




