Re: HDFS HA (Based on QJM) Failover Frequently with Large FSimage and Busy Requests

gu.yizhou Thu, 27 Apr 2017 05:17:15 -0700

1. Is service-rpc configured in namenode?


Not yet. I have considered configuring servicerpc, but I am also thinking about 
the possible disadvantages. 


Currently, failover happens because too many RPCs are waiting. If ZKFC's 
requests are processed normally on another port, is it possible that the 
clients will get a lot of failures?






2. ha.health-monitor.rpc-timeout.ms - Also consider increasing zkfc rpc call 
timeout to namenode. 


The same worry: is it possible that the clients will get a lot of failures?






Thanks very much,


Doris










---------------------------------------------------------------------------------------










1. Is service-rpc configured in namenode?
(dfs.namenode.servicerpc-address - this will create another RPC server 
listening on another port (say 8021) to handle all service (non-client) 
requests and hence default rpc address (say 8020) will handle only client 
requests.) 

This way, you can decouple client and service requests. Here, service requests 
correspond to RPC calls from DN, ZKFC, etc. Hence, when the cluster is too busy 
because of too many client operations, ZKFC requests are processed by a 
different RPC server and need not wait in the same queue as client requests.
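
For illustration, a minimal hdfs-site.xml sketch of the setting described 
above. In an HA setup the key is suffixed with the nameservice ID and the 
NameNode ID. The nameservice ID "nameservice" and the IDs nn1/nn2 are taken 
from the dfs.namenode.name.dir keys quoted later in this thread; the host 
name nn2 and port 8021 are assumptions, so adjust them to your cluster.

```xml
<!-- hdfs-site.xml: illustrative values only.
     Host nn2 and port 8021 are assumptions, not taken from the real cluster. -->
<property>
  <name>dfs.namenode.servicerpc-address.nameservice.nn1</name>
  <value>nn1:8021</value>
</property>
<property>
  <name>dfs.namenode.servicerpc-address.nameservice.nn2</name>
  <value>nn2:8021</value>
</property>
```

With this in place, client traffic stays on the default RPC address (8020 in 
the logs of this thread), while DN and ZKFC calls would go to the service 
port.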

2. ha.health-monitor.rpc-timeout.ms - Also consider increasing zkfc rpc call 
timeout to namenode. 

By default this is 45 secs. You can consider increasing it to 1 or 2 mins 
depending upon your cluster usage.
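
A minimal sketch of that change, assuming it goes into the core-site.xml read 
by ZKFC. The value 90000 ms (1.5 minutes) is only an example within the 
suggested 1 to 2 minute range; 45000 ms is the default and matches the 
"45000 millis timeout" in the HealthMonitor log quoted in this thread.

```xml
<!-- core-site.xml: illustrative value only; the default is 45000 ms. -->
<property>
  <name>ha.health-monitor.rpc-timeout.ms</name>
  <value>90000</value>
</property>
```

A longer timeout makes ZKFC more tolerant of a slow NameNode, at the cost of 
detecting a NameNode that is really hung later.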

Thanks,
Chackra 




On Wed, Apr 26, 2017 at 11:50 AM, <[email protected]> wrote:


Hi All,



    HDFS HA (based on QJM), 5 JournalNodes, Apache Hadoop 2.5.0 on Red Hat 6.5 
with JDK 1.7.


    We put 1 PB+ of data into HDFS, with an FSImage of about 10 GB, and then 
kept making more requests to this HDFS. The NameNodes fail over frequently. I 
would like to know the following:






    1. When the ANN (active NameNode) downloads fsimage.ckpt_* from the SNN 
(standby NameNode), disk I/O becomes very high; at the same time, ZKFC fails to 
monitor the health of the ANN due to a timeout. Is there any relationship 
between high disk I/O and the ZKFC monitor request timeout? Every failover 
happened during a checkpoint download, but not every checkpoint download leads 
to a failover.











2017-03-15 09:27:05,750 WARN org.apache.hadoop.ha.HealthMonitor: 
Transport-level exception trying to monitor health of NameNode at nn1/ip:8020: 
Call From nn1/ip to nn1:8020 failed on socket timeout exception: 
java.net.SocketTimeoutException: 45000 millis timeout while waiting for channel 
to be ready for read. ch : java.nio.channels.SocketChannel[connected 
local=/ip:48536 remote=nn1/ip:8020] For more details see:  
http://wiki.apache.org/hadoop/SocketTimeout


2017-03-15 09:27:05,750 INFO org.apache.hadoop.ha.HealthMonitor: Entering state 
SERVICE_NOT_RESPONDING




    2. Due to SERVICE_NOT_RESPONDING, the other ZKFC fences the old ANN 
(sshfence is configured). Before the old ANN is restarted by my additional 
monitor, its log sometimes looks like this. What is "Rescan of 
postponedMisreplicatedBlocks"? Does it have any relationship with the failover?

2017-03-15 04:36:00,866 INFO 
org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: 
Rescanning after 30000 milliseconds

2017-03-15 04:36:00,931 INFO 
org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 
0 directive(s) and 0 block(s) in 65 millisecond(s).

2017-03-15 04:36:01,127 INFO 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Rescan of 
postponedMisreplicatedBlocks completed in 23 msecs. 247361 blocks are left. 0 
blocks are removed.

2017-03-15 04:36:04,145 INFO 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Rescan of 
postponedMisreplicatedBlocks completed in 17 msecs. 247361 blocks are left. 0 
blocks are removed.

2017-03-15 04:36:07,159 INFO 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Rescan of 
postponedMisreplicatedBlocks completed in 14 msecs. 247361 blocks are left. 0 
blocks are removed.

2017-03-15 04:36:10,173 INFO 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Rescan of 
postponedMisreplicatedBlocks completed in 14 msecs. 247361 blocks are left. 0 
blocks are removed.

2017-03-15 04:36:13,188 INFO 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Rescan of 
postponedMisreplicatedBlocks completed in 14 msecs. 247361 blocks are left. 0 
blocks are removed.

2017-03-15 04:36:16,211 INFO 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Rescan of 
postponedMisreplicatedBlocks completed in 23 msecs. 247361 blocks are left. 0 
blocks are removed.

2017-03-15 04:36:19,234 INFO 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Rescan of 
postponedMisreplicatedBlocks completed in 22 msecs. 247361 blocks are left. 0 
blocks are removed.

2017-03-15 04:36:28,994 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: 
STARTUP_MSG:






    3. I configured two dfs.namenode.name.dir directories and one 
dfs.journalnode.edits.dir (which shares a disk with the NN). Is this suitable, 
or does it have any disadvantages?








<property>
  <name>dfs.namenode.name.dir.nameservice.nn1</name>
  <value>/data1/hdfs/dfs/name,/data2/hdfs/dfs/name</value>
</property>
<property>
  <name>dfs.namenode.name.dir.nameservice.nn2</name>
  <value>/data1/hdfs/dfs/name,/data2/hdfs/dfs/name</value>
</property>








<property>
  <name>dfs.journalnode.edits.dir</name>
  <value>/data1/hdfs/dfs/journal</value>
</property>


    



    4. I am interested in the design of checkpoint and edit log transmission. 
Are there any explanations, issues, or documents on it?







Thanks in advance,

Doris

