nandakumar131 commented on a change in pull request #852: HDDS-1454. GC other
system pause events can trigger pipeline destroy for all the nodes in the
cluster. Contributed by Supratim Deka
URL: https://github.com/apache/hadoop/pull/852#discussion_r294695045
##########
File path:
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/NodeStateManager.java
##########
@@ -464,6 +487,44 @@ public void setContainers(UUID uuid, Set<ContainerID> containerIds)
@Override
public void run() {
+ if (shouldSkipCheck()) {
+ skippedHealthChecks++;
+ LOG.info("Detected long delay in scheduling HB processing thread. "
+ + "Skipping heartbeat checks for one iteration.");
+ } else {
+ checkNodesHealth();
+ }
+
+ // we purposefully make this non-deterministic. Instead of using a
+ // scheduleAtFixedFrequency we will just go to sleep
+ // and wake up at the next rendezvous point, which is currentTime +
+ // heartbeatCheckerIntervalMs. This leads to the issue that we are now
+ // heart beating not at a fixed cadence, but clock tick + time taken to
+ // work.
+ //
+ // This time taken to work can skew the heartbeat processor thread.
+ // The reason why we don't care is because of the following reasons.
+ //
+ // 1. checkerInterval is general many magnitudes faster than datanode HB
+ // frequency.
+ //
+ // 2. if we have too much nodes, the SCM would be doing only HB
+ // processing, this could lead to SCM's CPU starvation. With this
+ // approach we always guarantee that HB thread sleeps for a little while.
+ //
+ // 3. It is possible that we will never finish processing the HB's in the
+ // thread. But that means we have a mis-configured system. We will warn
+ // the users by logging that information.
+ //
+ // 4. And the most important reason, heartbeats are not blocked even if
+ // this thread does not run, they will go into the processing queue.
+ scheduleNextHealthCheck();
+
+ return;
Review comment:
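We don't need this return statement. run() is void and this is its last statement, so the method ends here either way; please remove it.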
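
For reference, the pattern that the code comment in the diff describes (run one check, then schedule the next run a fixed delay after the work finishes, and skip a check when the thread was clearly scheduled late) can be sketched as below. This is only a minimal illustration and not the NodeStateManager code: the class name HealthCheckLoop, the maxDelayMs threshold, the isDelayed helper and the two counters are all hypothetical, and the counter increment stands in for checkNodesHealth().

```java
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of a self-rescheduling health check; not the SCM implementation. */
class HealthCheckLoop implements Runnable {
  private final ScheduledExecutorService executor;
  private final long intervalMs;
  private final long maxDelayMs;            // assumed tolerance before a run counts as "late"
  private final AtomicLong skipped = new AtomicLong();
  private final AtomicLong performed = new AtomicLong();
  private volatile long expectedRunMs;      // when the next run was supposed to start

  HealthCheckLoop(ScheduledExecutorService executor, long intervalMs, long maxDelayMs) {
    this.executor = executor;
    this.intervalMs = intervalMs;
    this.maxDelayMs = maxDelayMs;
  }

  /** True when the run started more than maxDelayMs after it was expected. */
  static boolean isDelayed(long expectedMs, long actualMs, long maxDelayMs) {
    return actualMs - expectedMs > maxDelayMs;
  }

  /** Monotonic clock in milliseconds, so wall-clock adjustments do not look like pauses. */
  private static long nowMs() {
    return System.nanoTime() / 1_000_000L;
  }

  void start() {
    scheduleNext();
  }

  long getSkipped() {
    return skipped.get();
  }

  long getPerformed() {
    return performed.get();
  }

  @Override
  public void run() {
    if (isDelayed(expectedRunMs, nowMs(), maxDelayMs)) {
      // A long pause (GC or otherwise) would make every node look stale; skip one round.
      skipped.incrementAndGet();
    } else {
      performed.incrementAndGet();          // stand-in for checkNodesHealth()
    }
    // Reschedule relative to "now", so the loop always sleeps between rounds.
    scheduleNext();
    // No trailing return is needed in a void method.
  }

  private void scheduleNext() {
    expectedRunMs = nowMs() + intervalMs;
    try {
      executor.schedule(this, intervalMs, TimeUnit.MILLISECONDS);
    } catch (RejectedExecutionException e) {
      // The executor was shut down; stop rescheduling.
    }
  }
}
```

Unlike a fixed-rate schedule, the effective period here is the interval plus the time the check itself takes, which is the trade-off the comment in the diff accepts.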