[ 
https://issues.apache.org/jira/browse/HADOOP-12569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15133513#comment-15133513
 ] 

Tao Jie commented on HADOOP-12569:
----------------------------------

In our version, we simply force-kill the NameNode process with a shell command 
before ZKFC quits due to a ZooKeeper connection failure.
Maybe we should add a STOP command to HAServiceProtocol?
Any thoughts?
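
For illustration, the force-kill workaround could look roughly like the sketch 
below. This is not our actual patch and not an existing Hadoop API: the class 
NameNodeFencer, its method names, and the idea of reading the PID from a pid 
file are all assumptions made for the example. It only shows the shape of 
"kill the NameNode first, then let ZKFC exit".

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper; not part of Hadoop. */
final class NameNodeFencer {

    /** Parses the contents of a pid file (a decimal PID, possibly with whitespace). */
    static long parsePid(String pidFileContents) {
        return Long.parseLong(pidFileContents.trim());
    }

    /** Reads the NameNode PID from a pid file whose location is an assumption. */
    static long readPid(Path pidFile) throws IOException {
        return parsePid(Files.readString(pidFile));
    }

    /** Builds the shell command used to force-kill the given process. */
    static List<String> buildKillCommand(long pid) {
        return List.of("kill", "-9", Long.toString(pid));
    }

    /**
     * Runs the kill command and waits up to timeoutSeconds for it to finish.
     * Returns true only if the command completed with exit code 0.
     */
    static boolean forceKill(long pid, long timeoutSeconds)
            throws IOException, InterruptedException {
        Process p = new ProcessBuilder(buildKillCommand(pid)).inheritIO().start();
        if (!p.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            p.destroyForcibly(); // the kill command itself hung; give up
            return false;
        }
        return p.exitValue() == 0;
    }
}
```

In this sketch, ZKFC would call readPid() and forceKill() on the path where it 
gives up on the ZooKeeper connection, just before exiting. A STOP command in 
HAServiceProtocol would be cleaner, since it would not depend on a local pid 
file or on shell access.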

> ZKFC should stop namenode before itself quit in some circumstances
> ------------------------------------------------------------------
>
>                 Key: HADOOP-12569
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12569
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: ha
>    Affects Versions: 2.6.0
>            Reporter: Tao Jie
>
> We have run into the following HA scenario:
> NN1 (active) and zkfc1 on node1;
> NN2 (standby) and zkfc2 on node2.
> 1. Stop the network on node1; NN2 becomes active. On node1, zkfc1 kills itself 
> because it cannot connect to ZooKeeper, but leaves NN1 still running.
> 2. Several minutes later, the network on node1 recovers. NN1 is still running but 
> is no longer under ZKFC control, so NN1 and NN2 both run as active NameNodes.
> Maybe ZKFC should stop the NameNode before quitting in such circumstances.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
