virajjasani commented on PR #5396:
URL: https://github.com/apache/hadoop/pull/5396#issuecomment-1480735197

   @ayushtkn to give you an update: the issue for which I was considering 
this optional behavior has already been fixed by 
https://github.com/apache/hadoop/commit/26fba8701c97928bb2ed2e6b456ab5ba9513e0fe
   We no longer see any transient connection failures after this commit. We are 
also trying to harmonize socket connection timeouts across all daemons so that 
they are in sync with OS-level settings.
   
   Hence we no longer need the functionality of this PR. We did, however, end 
up building some resilience into our k8s operators to deal with transient 
failures in the future, i.e. bounce the DN pod if it doesn't stay connected 
consistently to the active NN pod. We had to make some connection ports 
accessible, etc. :)
   
   Thanks


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

