namenode isn't saving edits after temporary disk problem

Whitney Jackson Mon, 09 Aug 2021 21:10:46 -0700

Hi,

My cluster is up and running after its two namenodes ran out of disk space.
It's mostly happy except that the currently active namenode isn't recording
edits to disk. I don't see any modifications to the edits_inprogress file
and no new fsimage files are being recorded.


If I enter safe mode and try to run "hdfs dfsadmin -saveNamespace" I get:

java.io.IOException: No image directories available!
        at org.apache.hadoop.hdfs.server.namenode.FSImage.saveFSImageInAllDirs(FSImage.java:1218)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.saveNamespace(FSImage.java:1162)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.saveNamespace(FSImage.java:1132)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.saveNamespace(FSNamesystem.java:4494)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.saveNamespace(NameNodeRpcServer.java:1270)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.saveNamespace(ClientNamenodeProtocolServerSideTranslatorPB.java:873)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:528)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:999)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:927)
        at java.base/java.security.AccessController.doPrivileged(Native Method)
        at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2915)
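
In case the exact sequence matters, it is roughly the following. This is only a sketch: the function name try_save_namespace is made up for illustration, and it assumes the commands are run as the HDFS superuser against the active namenode.

```shell
# Sketch of the reproduction steps; try_save_namespace is a hypothetical
# helper name, not part of Hadoop.
try_save_namespace() {
    # saveNamespace is only allowed while the namenode is in safe mode.
    hdfs dfsadmin -safemode enter || return 1
    # Ask the namenode to write a new fsimage and roll the edit log.
    # This is the call that fails with "No image directories available!".
    # Note: the namenode is left in safe mode afterwards.
    hdfs dfsadmin -saveNamespace
}
```

Calling try_save_namespace returns the exit status of the last hdfs command that ran, so a non-zero status means either safe mode could not be entered or the save itself failed.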

The standby namenode won't start and attempts to bootstrap it fail with:
"Could not find image with txid ..."

I'm concerned that if I restart the running namenode all edits since it
stopped recording to disk will be lost.

Is there anything I can do to resolve this?

Thanks,

Whitney
