Hi, Hadoopers!
I am using Hadoop 2.7.1, and my cluster has 130 nodes.
Recently I have been facing a problem: Nagios found a corrupt block.
The Nagios check queries the NameNode JMX endpoint:
http://namenode01:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem
Below is the result; note that CorruptBlocks is 1.
{
"beans" : [ {
"name" : "Hadoop:service=NameNode,name=FSNamesystem",
"modelerType" : "FSNamesystem",
"tag.Context" : "dfs",
"tag.HAState" : "active",
"tag.Hostname" : "css0700.nhnsystem.com",
"MissingBlocks" : 0,
"MissingReplOneBlocks" : 0,
"ExpiredHeartbeats" : 10,
"TransactionsSinceLastCheckpoint" : 820630,
"TransactionsSinceLastLogRoll" : 1916,
"LastWrittenTransactionId" : 376685578,
"LastCheckpointTime" : 1452583950883,
"CapacityTotal" : 1650893130075660,
"CapacityTotalGB" : 1537514.0,
"CapacityUsed" : 1237079990848257,
"CapacityUsedGB" : 1152121.0,
"CapacityRemaining" : 410189981364473,
"CapacityRemainingGB" : 382019.0,
"CapacityUsedNonDFS" : 3623157862930,
"TotalLoad" : 6717,
"SnapshottableDirectories" : 0,
"Snapshots" : 0,
"BlocksTotal" : 4034155,
"FilesTotal" : 2866690,
"PendingReplicationBlocks" : 0,
"UnderReplicatedBlocks" : 0,
"CorruptBlocks" : 1,
"ScheduledReplicationBlocks" : 0,
"PendingDeletionBlocks" : 0,
"ExcessBlocks" : 87,
"PostponedMisreplicatedBlocks" : 0,
"PendingDataNodeMessageCount" : 0,
"MillisSinceLastLoadedEdits" : 0,
"BlockCapacity" : 67108864,
"StaleDataNodes" : 0,
"TotalFiles" : 2866690
} ]
}
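For reference, the check boils down to fetching that URL and reading CorruptBlocks out of the first bean in the JSON response. A minimal Python sketch of that parsing step (the helper names here are mine, not part of Nagios or Hadoop):

```python
import json
from urllib.request import urlopen

# The JMX endpoint polled by the Nagios check (from the post above).
JMX_URL = ("http://namenode01:50070/jmx"
           "?qry=Hadoop:service=NameNode,name=FSNamesystem")

def fsnamesystem_metric(jmx_json, metric):
    """Return one metric from the FSNamesystem bean in a /jmx response."""
    return json.loads(jmx_json)["beans"][0][metric]

def fetch_metric(url, metric):
    """Fetch the JMX page and extract a metric (needs a live NameNode)."""
    with urlopen(url) as resp:
        return fsnamesystem_metric(resp.read().decode("utf-8"), metric)

if __name__ == "__main__":
    # Against a live cluster: fetch_metric(JMX_URL, "CorruptBlocks")
    # Offline demo on a trimmed copy of the response shown above:
    sample = ('{"beans": [{"name": '
              '"Hadoop:service=NameNode,name=FSNamesystem", '
              '"CorruptBlocks": 1}]}')
    print(fsnamesystem_metric(sample, "CorruptBlocks"))  # prints 1
```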
However, 'hdfs fsck / -list-corruptfileblocks' does not find anything. Below is the fsck result:
The filesystem under path '/' has 0 CORRUPT files
'hdfs dfsadmin -report' is similar to the first (JMX) result:
Configured Capacity: 1650892318461452 (1.47 PB)
Present Capacity: 1647353181258422 (1.46 PB)
DFS Remaining: 408711410865856 (371.72 TB)
DFS Used: 1238641770392566 (1.10 PB)
DFS Used%: 75.19%
Under replicated blocks: 0
Blocks with corrupt replicas: 1
Missing blocks: 0
Missing blocks (with replication factor 1): 0
My questions are:
1. Why do the two results differ?
2. How do I find the name of the corrupt file? I would like to know which file is corrupt.
Thank you.