nandakumar131 commented on a change in pull request #711: HDDS-1368. Cleanup
old ReplicationManager code from SCM.
URL: https://github.com/apache/hadoop/pull/711#discussion_r276116040
##########
File path:
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/DeadNodeHandler.java
##########
@@ -18,121 +18,155 @@
package org.apache.hadoop.hdds.scm.node;
-import java.util.Set;
+import java.io.IOException;
+import java.util.Optional;
import org.apache.hadoop.hdds.protocol.DatanodeDetails;
import org.apache.hadoop.hdds.protocol.proto.HddsProtos;
import org.apache.hadoop.hdds.scm.container.ContainerException;
-import org.apache.hadoop.hdds.scm.container.ContainerID;
import org.apache.hadoop.hdds.scm.container.ContainerInfo;
import org.apache.hadoop.hdds.scm.container.ContainerManager;
import org.apache.hadoop.hdds.scm.container.ContainerNotFoundException;
-import org.apache.hadoop.hdds.scm.container.ContainerReplica;
-import org.apache.hadoop.hdds.scm.container.replication.ReplicationRequest;
-import org.apache.hadoop.hdds.scm.events.SCMEvents;
import org.apache.hadoop.hdds.scm.node.states.NodeNotFoundException;
+import org.apache.hadoop.hdds.scm.pipeline.PipelineManager;
+import org.apache.hadoop.hdds.scm.pipeline.PipelineNotFoundException;
import org.apache.hadoop.hdds.server.events.EventHandler;
import org.apache.hadoop.hdds.server.events.EventPublisher;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
+import static org.apache.hadoop.hdds.scm.events.SCMEvents.CLOSE_CONTAINER;
+
/**
* Handles Dead Node event.
*/
public class DeadNodeHandler implements EventHandler<DatanodeDetails> {
- private final ContainerManager containerManager;
-
private final NodeManager nodeManager;
+ private final PipelineManager pipelineManager;
+ private final ContainerManager containerManager;
private static final Logger LOG =
LoggerFactory.getLogger(DeadNodeHandler.class);
- public DeadNodeHandler(NodeManager nodeManager,
- ContainerManager containerManager) {
- this.containerManager = containerManager;
+ public DeadNodeHandler(final NodeManager nodeManager,
+ final PipelineManager pipelineManager,
+ final ContainerManager containerManager) {
this.nodeManager = nodeManager;
+ this.pipelineManager = pipelineManager;
+ this.containerManager = containerManager;
}
@Override
- public void onMessage(DatanodeDetails datanodeDetails,
- EventPublisher publisher) {
+ public void onMessage(final DatanodeDetails datanodeDetails,
+ final EventPublisher publisher) {
- // TODO: check if there are any pipeline on this node and fire close
- // pipeline event
- Set<ContainerID> ids =
- null;
try {
- ids = nodeManager.getContainers(datanodeDetails);
- } catch (NodeNotFoundException e) {
+
+ /*
+ * We should have already destroyed all the pipelines on this datanode
+ * when it was marked as stale. Destroy pipeline should also have closed
+ * all the containers on this datanode.
+ *
+ * Ideally we should not have any pipeline or OPEN containers now.
+ *
+ * To be on a safer side, we double check here and take appropriate
+ * action.
+ */
+
+ destroyPipelines(datanodeDetails);
+ closeContainers(datanodeDetails, publisher);
Review comment:
TLDR; destoryPipelines internally calls close container already.
There are three stages where we close the containers when a datanode goes
dead
1. When the node is marked as stale, we destroy the pipeline which will try
to close the containers cleanly.
(Finalizing and destroying the pipeline after the timeout)
-> When the node is marked as dead
2. destroyPipelines(datanodeDetails) --> call
Here we destroy the pipeline which will also try to close the containers,
here we don't try to close the container cleanly. This is a fallback mechanism,
if everything was working fine we should not have any pipelines at this point.
(Finalizing and destroying the pipeline without timeout)
3. closeContainers(datanodeDetails, publisher) --> call
This is the last resort for closing a container, it is to handle a weird
case where there is open container and it is not part of any pipeline. [This
can (or) should not happen]
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]