According to Cloudera's guide, moving JournalNodes requires downtime: https://www.cloudera.com/documentation/enterprise/5-7-x/topics/admin_nn_migrate_roles.html#concept_w3h_m2l_2r
There is also a community ticket about this problem, which is still unresolved: https://issues.apache.org/jira/browse/HDFS-10665

For the misbehaving JournalNode, have you tried collecting system metrics, and possibly profiling it, to identify the root cause?

From: 孙锐 [mailto:[email protected]]
Sent: Thursday, October 26, 2017 11:06 AM
To: [email protected]
Subject: How to add new journal nodes without service downtime?

Hi folks,

We are using Hadoop 2.6.0 (CDH version) with 3 JournalNodes, and we want to add 2 more to the existing 3. We tried to add the nodes without service downtime by following some posts in the community, but that approach does not seem reliable. It seems that adding or moving JournalNodes requires downtime of the HDFS service. Is this correct?

Another question: if one JournalNode has been down or slow for some time (lagging far behind the other JournalNodes), can it be brought back to work by simply restarting it, or does it have to be moved to another machine?

It seems that an operational guide for JournalNodes is missing from the official documentation.

Thanks,
Ray
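
For reference, the set of JournalNodes the NameNodes write to is listed in the dfs.namenode.shared.edits.dir property as a qjournal:// URI, so going from 3 to 5 nodes means changing that value. The following is only a minimal sketch of how the old and new values differ. The hostnames and the journal ID "mycluster" are hypothetical placeholders, 8485 is the default JournalNode RPC port, and the two helper functions are illustrative and not part of any Hadoop API. The quorum arithmetic assumes the usual majority rule of the Quorum Journal Manager.

```python
DEFAULT_JN_RPC_PORT = 8485  # default JournalNode RPC port


def qjournal_uri(hosts, journal_id, port=DEFAULT_JN_RPC_PORT):
    """Build the value for dfs.namenode.shared.edits.dir (illustrative helper)."""
    if not hosts:
        raise ValueError("at least one JournalNode host is required")
    # JournalNode addresses are separated by semicolons in the URI authority.
    authority = ";".join(f"{host}:{port}" for host in hosts)
    return f"qjournal://{authority}/{journal_id}"


def tolerated_failures(journal_node_count):
    """Edits must reach a majority of JournalNodes, so (N - 1) // 2 may fail."""
    return (journal_node_count - 1) // 2


if __name__ == "__main__":
    # Hypothetical hostnames and journal ID, for illustration only.
    current = ["jn1.example.com", "jn2.example.com", "jn3.example.com"]
    expanded = current + ["jn4.example.com", "jn5.example.com"]

    for hosts in (current, expanded):
        print(qjournal_uri(hosts, "mycluster"))
        print("tolerated JournalNode failures:", tolerated_failures(len(hosts)))
```

Under these assumptions, 3 JournalNodes tolerate 1 failure and 5 tolerate 2, which is the usual motivation for growing the set.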
