[PR] Update FailureDetector recovery logic to not break if only one query engine is being used [pinot]

via GitHub Wed, 19 Feb 2025 00:20:27 -0800


yashmayya opened a new pull request, #15086:
URL: https://github.com/apache/pinot/pull/15086


   - After https://github.com/apache/pinot/pull/15005, the failure detector 
recovery mechanism, to be safe, marked a server as healthy again only once 
every retry handler reported it as healthy.
   - However, in use cases where only one query engine is being used, the 
server channels for the other engine would never have been established. When 
only v2 queries were being used, the v1 Netty handler created its channel 
during the retry process. When only v1 queries were being used, the v2 gRPC 
channel wasn't created at all, so the server would never be marked as healthy 
again until the end of the retry period.
   - Both of these behaviors are incorrect - we only want to retry the channels 
we've already established through queries. This patch fixes the retry mechanism 
by introducing a new `FailureDetector#ServerState` that can be healthy, 
unhealthy or unknown. The server will be marked as healthy again only if none 
of the retry handlers return unhealthy.
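
A minimal sketch of the tri-state check described above, for illustration only. 
The names `RetrySketch`, `shouldMarkHealthy` and the `Function`-based handler 
shape are assumptions made for this example and are not taken from the Pinot 
code base; only the three states (healthy, unhealthy, unknown) and the "none of 
the handlers return unhealthy" rule come from the description.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in for the tri-state named in the description;
// this is not the actual FailureDetector code.
enum ServerState {
  HEALTHY, UNHEALTHY, UNKNOWN
}

final class RetrySketch {
  private RetrySketch() {
  }

  // Each handler maps a server instance id to the state it observed.
  // Assumption: a handler that never established a channel to the server
  // (for example, the engine is unused) reports UNKNOWN instead of opening one.
  static boolean shouldMarkHealthy(String instanceId,
      List<Function<String, ServerState>> handlers) {
    for (Function<String, ServerState> handler : handlers) {
      if (handler.apply(instanceId) == ServerState.UNHEALTHY) {
        // A single UNHEALTHY report keeps the server out of rotation.
        return false;
      }
    }
    // HEALTHY and UNKNOWN reports do not block recovery.
    return true;
  }
}
```

With this shape, a deployment that only runs v1 queries would get HEALTHY from 
the v1 handler and UNKNOWN from the v2 handler, and the server would be marked 
as healthy without waiting for the retry period to end. Note that in this 
sketch a server for which every handler reports UNKNOWN is also treated as 
recoverable; whether the actual patch does the same is not stated here.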


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


