Hi,
My cluster is up and running after its two namenodes ran out of disk space.
It's mostly happy except that the currently active namenode isn't recording
edits to disk. I don't see any modifications to the edits_inprogress file
and no new fsimage files are being recorded.
If I enter safe mode and try to run "hdfs dfsadmin -saveNamespace" I get:
java.io.IOException: No image directories available!
    at org.apache.hadoop.hdfs.server.namenode.FSImage.saveFSImageInAllDirs(FSImage.java:1218)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.saveNamespace(FSImage.java:1162)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.saveNamespace(FSImage.java:1132)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.saveNamespace(FSNamesystem.java:4494)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.saveNamespace(NameNodeRpcServer.java:1270)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.saveNamespace(ClientNamenodeProtocolServerSideTranslatorPB.java:873)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:528)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:999)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:927)
    at java.base/java.security.AccessController.doPrivileged(Native Method)
    at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2915)
The standby namenode won't start and attempts to bootstrap it fail with:
"Could not find image with txid ..."
I'm concerned that if I restart the running namenode, all edits made since it
stopped recording to disk will be lost.
Is there anything I can do to resolve this?
Thanks,
Whitney