I recently restarted my small cluster (1 namenode, 1 job tracker, 1 secondary nn, 6 compute/data nodes). However the namenode refuses to start up due to a NPE.
After googling I saw some suggestions of doing a printf "\xff\xff\xff\xee\xff" > edits, however this did not fix the problem. Any ideas how to get this back up? I can clearly remove the edits.new file and it works, but then I have lost data. -rw-rw-r-- 1 hadoop hadoop 5 Aug 7 19:33 edits -rw-rw-r-- 1 hadoop hadoop 1049088 Aug 7 19:33 edits.new -rw-rw-r-- 1 hadoop hadoop 61621 Aug 7 19:33 fsimage -rw-rw-r-- 1 hadoop hadoop 8 Aug 7 19:33 fstime -rw-rw-r-- 1 hadoop hadoop 112 Aug 7 19:33 VERSION 2012-08-07 19:37:08,187 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: STARTUP_MSG: /************************************************************ STARTUP_MSG: Starting NameNode STARTUP_MSG: host = XXXXXXXXXXXX STARTUP_MSG: args = [] STARTUP_MSG: version = 1.0.3 STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1335192; compiled by 'hortonfo' on Tue May 8 20:31:25 UTC 2012 ************************************************************/ 2012-08-07 19:37:08,404 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties 2012-08-07 19:37:08,416 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered. 2012-08-07 19:37:08,417 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s). 2012-08-07 19:37:08,417 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system started 2012-08-07 19:37:08,750 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered. 2012-08-07 19:37:08,775 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists! 2012-08-07 19:37:08,811 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm registered. 2012-08-07 19:37:08,830 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source NameNode registered. 2012-08-07 19:37:08,849 INFO org.apache.hadoop.hdfs.util.GSet: VM type = 32-bit 2012-08-07 19:37:08,849 INFO org.apache.hadoop.hdfs.util.GSet: 2% max memory = 19.33375 MB 2012-08-07 19:37:08,852 INFO org.apache.hadoop.hdfs.util.GSet: capacity = 2^22 = 4194304 entries 2012-08-07 19:37:08,852 INFO org.apache.hadoop.hdfs.util.GSet: recommended=4194304, actual=4194304 2012-08-07 19:37:08,985 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner=hadoop 2012-08-07 19:37:08,985 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup=supergroup 2012-08-07 19:37:08,985 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled=true 2012-08-07 19:37:08,990 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.block.invalidate.limit=100 2012-08-07 19:37:08,990 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s) 2012-08-07 19:37:09,004 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered FSNamesystemStateMBean and NameNodeMXBean 2012-08-07 19:37:09,028 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Caching file names occuring more than 10 times 2012-08-07 19:37:09,037 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files = 560 2012-08-07 19:37:09,098 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files under construction = 1 2012-08-07 19:37:09,099 INFO org.apache.hadoop.hdfs.server.common.Storage: Image file of size 61621 loaded in 0 seconds. 2012-08-07 19:37:09,100 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Invalid opcode, reached end of edit log Number of transactions found 0 2012-08-07 19:37:09,100 INFO org.apache.hadoop.hdfs.server.common.Storage: Edits file /hadoop/hadoop-1.0.3/namenode-data/current/edits of size 5 edits # 0 loaded in 0 seconds. 2012-08-07 19:37:09,117 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.lang.NullPointerException at org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedSetTimes(FSDirectory.java:1351) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedSetTimes(FSDirectory.java:1340) at org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadFSEdits(FSEditLog.java:790) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSEdits(FSImage.java:1026) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:839) at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:377) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:100) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:388) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:362) at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:276) at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:496) at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1279) at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1288) 2012-08-07 19:37:09,118 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG: /************************************************************ SHUTDOWN_MSG: Shutting down NameNode at XXXXXXXXXXXX ************************************************************/
