Hi,
We sometimes see tasks fail with the exception below. There are no network
issues and the domain name resolves normally. All nodes also run a local DNS
caching daemon. Any idea why we see this error? It usually happens
when more than one job is running on the cluster.
We could, of course, add all nodes to /etc/hosts, but I would prefer not to.
java.net.UnknownHostException: unknown host: namenode
at org.apache.hadoop.ipc.Client$Connection.<init>(Client.java:214)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1192)
at org.apache.hadoop.ipc.Client.call(Client.java:1046)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
at $Proxy2.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:118)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:222)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:187)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1328)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:65)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1346)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:244)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:122)
at org.apache.hadoop.mapred.Child$4.run(Child.java:254)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
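In case it helps anyone reproduce this outside Hadoop, here is a rough probe that could be run on a worker node while several jobs are active. It is only a sketch: the class name ResolveProbe, the method countFailures, and port 8020 are placeholders of mine, not anything taken from Hadoop. I am also assuming that the failure corresponds to the JVM ending up with an unresolved socket address, which is what the "unknown host" message suggests.

```java
import java.net.InetSocketAddress;
import java.security.Security;

public class ResolveProbe {

    // Builds a socket address `attempts` times and counts how often the
    // host name was left unresolved. The constructor performs the lookup.
    public static int countFailures(String host, int port, int attempts) {
        int failures = 0;
        for (int i = 0; i < attempts; i++) {
            InetSocketAddress addr = new InetSocketAddress(host, port);
            if (addr.isUnresolved()) {
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        // Ask the JVM not to cache lookups, so each attempt goes back to
        // the system resolver (and the local DNS caching daemon). This must
        // run before the first lookup in this JVM to take effect.
        Security.setProperty("networkaddress.cache.ttl", "0");
        Security.setProperty("networkaddress.cache.negative.ttl", "0");

        String host = args.length > 0 ? args[0] : "namenode";
        int attempts = args.length > 1 ? Integer.parseInt(args[1]) : 1000;

        // 8020 is a placeholder port; it is not used to open a connection.
        int failures = countFailures(host, 8020, attempts);
        System.out.println(failures + " of " + attempts
                + " lookups failed for " + host);
    }
}
```

Running something like `java ResolveProbe namenode 10000` on a few nodes at once, with and without jobs running, should show whether lookups fail intermittently under load or whether the problem only appears inside the task JVMs.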
Thanks