virajjasani commented on PR #5396: URL: https://github.com/apache/hadoop/pull/5396#issuecomment-1480735197
@ayushtkn To give you an update: the issue that made me consider this optional behavior is already fixed by https://github.com/apache/hadoop/commit/26fba8701c97928bb2ed2e6b456ab5ba9513e0fe. We no longer see any transient connection failures after this commit. We are also trying to harmonize socket connection timeouts across all daemons so that they stay in sync with OS-level settings.

Hence we no longer need the functionality of this PR. We did, however, end up building some resilience into our k8s operators to deal with transient failures in the future, i.e. bounce the DN pod if it doesn't stay consistently connected to the active NN pod. Had to get some connection ports accessible, etc :)

Thanks
