yashmayya opened a new pull request, #15086: URL: https://github.com/apache/pinot/pull/15086
- After https://github.com/apache/pinot/pull/15005, the failure detector recovery mechanism required every retry handler to report the server as healthy before the server was marked healthy again.
- However, in use cases where only one query engine is in use, the server channels for the other engine are never established. When only v2 queries were in use, the v1 Netty handler created the channel during the retry process. When only v1 queries were in use, the v2 gRPC channel wasn't created, so the server was never marked as healthy again until the end of the retry period.
- Both of these behaviors are incorrect: we only want to retry the channels we've already established through queries. This patch fixes the retry mechanism by introducing a new `FailureDetector#ServerState` that can be healthy, unhealthy or unknown. The server is marked as healthy again only if none of the retry handlers return unhealthy.
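The rule described above can be sketched roughly as follows. This is a minimal, standalone illustration and not the code in the patch: the class name `ServerStateSketch`, the method `shouldMarkHealthy`, the `Function<String, ServerState>` handler shape, and the instance id are all assumptions made for the example. Only the three states (healthy, unhealthy, unknown) and the "none of the handlers return unhealthy" rule come from the description.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical standalone sketch; names here are illustrative, not Pinot's API.
class ServerStateSketch {

  // Three-valued state, mirroring the healthy / unhealthy / unknown
  // states named in the PR description.
  enum ServerState { HEALTHY, UNHEALTHY, UNKNOWN }

  // A server is marked healthy again only if no retry handler reports
  // UNHEALTHY. A handler that never established a channel to the server
  // returns UNKNOWN, which does not block recovery.
  static boolean shouldMarkHealthy(String instanceId,
      List<Function<String, ServerState>> retryHandlers) {
    for (Function<String, ServerState> handler : retryHandlers) {
      if (handler.apply(instanceId) == ServerState.UNHEALTHY) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // Assumed scenario: only v1 queries are in use, so the v1 handler has a
    // channel and can probe it, while the v2 handler has none to retry.
    Function<String, ServerState> v1Handler = id -> ServerState.HEALTHY;
    Function<String, ServerState> v2Handler = id -> ServerState.UNKNOWN;
    Function<String, ServerState> failingHandler = id -> ServerState.UNHEALTHY;

    System.out.println(
        shouldMarkHealthy("Server_1", List.of(v1Handler, v2Handler)));
    System.out.println(
        shouldMarkHealthy("Server_1", List.of(failingHandler, v2Handler)));
  }
}
```

In this sketch, a handler with no established channel answers unknown instead of creating a channel or blocking recovery, which is the distinction the new state is meant to capture.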