icefury71 edited a comment on issue #4484:
URL: 
https://github.com/apache/incubator-pinot/issues/4484#issuecomment-676597798


   Thanks for the comments Subbu and Ting. Based on what @mcvsubbu mentioned, 
I'm thinking of adding a failure detector capability in the Pinot broker so 
that it can proactively prune bad server replicas. Although this is not a 
complete solution as mentioned, it's still very useful for graceful degradation 
(instead of query failures).
   
   High level thoughts on the design: Have a failure detector interface 
(implemented with different algorithms) that keeps track of all servers in the 
External View and reports which Pinot servers are healthy. This can then be 
used by the Broker to determine where to route the query. 
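
A minimal sketch of what such an interface could look like, assuming the broker reports the outcome of each server request back to the detector. All names here (`FailureDetector`, `ConsecutiveFailureDetector`, `filterHealthy`) are hypothetical and are not existing Pinot APIs; the consecutive-failure threshold is just one simple algorithm chosen for illustration.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;

/** Hypothetical interface for illustration; not part of Pinot. */
interface FailureDetector {
  /** Called when a request to the given server succeeded. */
  void recordSuccess(String serverId);

  /** Called when a request to the given server failed or timed out. */
  void recordFailure(String serverId);

  /** Returns true if the server is currently considered healthy. */
  boolean isHealthy(String serverId);

  /** Keeps only the candidates (e.g. servers from the External View) that are healthy. */
  default Set<String> filterHealthy(Set<String> candidates) {
    return candidates.stream().filter(this::isHealthy).collect(Collectors.toSet());
  }
}

/** One possible algorithm: mark a server unhealthy after N consecutive failures. */
final class ConsecutiveFailureDetector implements FailureDetector {
  private final int maxConsecutiveFailures;
  private final Map<String, Integer> consecutiveFailures = new ConcurrentHashMap<>();

  ConsecutiveFailureDetector(int maxConsecutiveFailures) {
    this.maxConsecutiveFailures = maxConsecutiveFailures;
  }

  @Override
  public void recordSuccess(String serverId) {
    // Any success resets the failure streak.
    consecutiveFailures.remove(serverId);
  }

  @Override
  public void recordFailure(String serverId) {
    consecutiveFailures.merge(serverId, 1, Integer::sum);
  }

  @Override
  public boolean isHealthy(String serverId) {
    return consecutiveFailures.getOrDefault(serverId, 0) < maxConsecutiveFailures;
  }
}
```

With this shape, the routing code would call `filterHealthy` on the candidate replicas before picking one. A real implementation would also need a way for an unhealthy server to be retried and recover, which this sketch leaves out.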
   
   There are many different algorithms for failure detection in distributed 
systems ranging from periodic pings to piggybacking on the server responses to 
determine health after the fact.
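
As a rough illustration of the periodic-ping end of that range, the sketch below probes every tracked server on a fixed schedule. The class name `PingFailureDetector` and the `probe` predicate are assumptions made for this example; how a broker would actually ping a server (and with what timeout) is left open.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Predicate;

/** Hypothetical periodic-ping detector for illustration; not part of Pinot. */
final class PingFailureDetector {
  // Assumed health check: returns true if the server answered the ping.
  private final Predicate<String> probe;
  private final Map<String, Boolean> healthy = new ConcurrentHashMap<>();

  PingFailureDetector(Set<String> servers, Predicate<String> probe) {
    this.probe = probe;
    // Servers start out healthy until a probe says otherwise.
    servers.forEach(server -> healthy.put(server, true));
  }

  /** Runs one probe round over all tracked servers. */
  void probeOnce() {
    for (String server : healthy.keySet()) {
      boolean ok;
      try {
        ok = probe.test(server);
      } catch (RuntimeException e) {
        // A probe that throws is treated the same as a failed ping.
        ok = false;
      }
      healthy.put(server, ok);
    }
  }

  /** Schedules probe rounds at a fixed period; the caller shuts the executor down. */
  ScheduledExecutorService start(long periodMillis) {
    ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
    executor.scheduleAtFixedRate(this::probeOnce, 0, periodMillis, TimeUnit.MILLISECONDS);
    return executor;
  }

  /** Untracked servers are reported as unhealthy in this sketch. */
  boolean isHealthy(String server) {
    return healthy.getOrDefault(server, false);
  }
}
```

The piggybacking approach would instead update the same health map from the outcome of real query responses, trading extra ping traffic for detection that only happens after a query has already hit the bad server.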
   
   I'll add a formal design document for more details. Please let me know if 
there are any concerns up front. (CC @mayankshriv @kishoreg @Jackie-Jiang for 
visibility)


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



