functioner commented on code in PR #1782:
URL: https://github.com/apache/zookeeper/pull/1782#discussion_r2684773134
##########
zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/Learner.java:
##########
@@ -410,8 +410,12 @@ private Socket connectToLeader() throws IOException,
X509Exception, InterruptedE
int remainingTimeout;
long startNanoTime = nanoTime();
- for (int tries = 0; tries < 5 && socket.get() == null; tries++) {
+ for (int tries = 0; tries < 5; tries++) {
try {
+ sock = createSocket();
+ if (socket.get() == null) {
+ continue;
Review Comment:
@PAC-MAJ
The "continue;" you mentioned handles the case that `sock = createSocket();`
gives null value. In this case, the abnormal null value comes from some
internal issue of the zk learner, not from leader's connection issue. Hence
introducing the backoff delay leaderConnectDelayDuringRetryMs will not make
sense, because that delay should focus on coordinating learn-leader connection
frequency.
In a broader sense, the loop `for (int tries = 0; tries < 5; tries++)` is
for the retry of creating socket AND connecting leader. Maybe this logic could
be decoupled as two phases? For example, write a retry for loop for the
leaderConnectDelayDuringRetryMs backoff delay, after successfully creating the
socket.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]