Some of my Hive queries are failing with this exception:
Failure while running task:java.lang.RuntimeException:
java.lang.RuntimeException: java.io.IOException: java.io.IOException:
Cannot obtain block length for
LocatedBlock{BP-2028945280-10.35.63.1-1481200988158:blk_1088997479_15461734;
getBlockSize()=6980; corrupt=false; offset=0; locs=[
DatanodeInfoWithStorage[10.35.63.20:50010,DS-885ec97e-e948-457f-9ea9-ebc923b487d5,DISK],
DatanodeInfoWithStorage[10.35.63.43:50010,DS-bae7e35e-fe36-4aa8-9aad-caffbe89736c,DISK],
DatanodeInfoWithStorage[10.35.63.42:50010,DS-e38d612d-1b93-4aed-b490-3a133656ecc5,DISK]]}
I found the problem block on the 10.35.63.20 DataNode:
find /data/ -name 'blk_1088997479*'
/data/7/hadoop/hdfs/data/current/BP-2028945280-10.35.63.1-1481200988158/current/rbw/blk_1088997479
/data/7/hadoop/hdfs/data/current/BP-2028945280-10.35.63.1-1481200988158/current/rbw/blk_1088997479_15461734.meta
fsck reports that the filesystem is healthy:
Total size: 48329290532734 B (Total open files size: 386901241354 B)
Total dirs: 290047
Total files: 417880
Total symlinks: 0 (Files currently being written: 3765)
Total blocks (validated): 583003 (avg. block size 82897155 B) (Total open file blocks (not validated): 5003)
Minimally replicated blocks: 583003 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 3
Average block replication: 3.001897
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 30
Number of racks: 3
FSCK ended at Fri May 26 13:03:30 MSK 2017 in 6845 milliseconds
The filesystem under path '/' is HEALTHY
However, I can't find the file associated with this block: running
'hadoop fsck / -files -blocks | grep blk_1088997479' returns nothing.
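One thing I have not ruled out yet: the replica sits in the rbw directory, so the file may still be open for write, and fsck skips such files unless -openforwrite is passed. Below is a minimal sketch of the check I have in mind. The find_block_owner helper is hypothetical (it is not part of Hadoop), and it assumes that in the fsck listing each file line starts with "/" and is followed by the lines for its blocks.

```shell
# Hypothetical helper (not part of Hadoop): reads `fsck -files -blocks` output
# on stdin and prints the file line that owns the given block ID.
# Assumption: file lines start with "/" and their block lines follow them.
find_block_owner() {
  awk -v blk="$1" '
    /^\// { owner = $0 }                                  # remember the latest file line
    index($0, blk) && owner != "" { print owner; exit }   # first line mentioning the block
  '
}

# Intended use on the cluster (-openforwrite is what my earlier command lacked):
#   hadoop fsck / -openforwrite -files -blocks -locations | find_block_owner blk_1088997479
```

If this prints a path, that file is the one holding the block, and whether it is still marked as open for write should be visible on the same line.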
Any advice on what I should do next to track down the problem?