Good day,

I've been testing Solr 4.4 in our dev environment with the newly announced HDFS integration, and I'm having trouble getting NameNode HA to work. To start, I had to replace all of the Hadoop jars in WEB-INF/lib/ with the matching jars from our Hadoop distribution (2.0.0-cdh4.2.0). Once that was done, Solr worked properly with the data and index directories in HDFS while connecting directly to the active NameNode. After changing the configuration to use HA, I get an UnknownHostException that I believe I shouldn't be getting.
Here's the stack trace:

**
6537 [coreLoadExecutor-4-thread-1] ERROR org.apache.solr.core.CoreContainer - Unable to create core: core0
org.apache.solr.common.SolrException: java.net.UnknownHostException: nameservice1
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:835)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:629)
    at org.apache.solr.core.ZkContainer.createFromZk(ZkContainer.java:270)
    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:655)
    at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:364)
    at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:356)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.IllegalArgumentException: java.net.UnknownHostException: nameservice1
    at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:414)
    at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:164)
    at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:129)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:436)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:403)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:125)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2262)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:86)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2296)
    at org.apache.hadoop.fs.FileSystem$Cache.getUnique(FileSystem.java:2284)
    at org.apache.hadoop.fs.FileSystem.newInstance(FileSystem.java:362)
    at org.apache.solr.store.hdfs.HdfsDirectory.<init>(HdfsDirectory.java:59)
    at org.apache.solr.core.HdfsDirectoryFactory.create(HdfsDirectoryFactory.java:154)
    at org.apache.solr.core.CachingDirectoryFactory.get(CachingDirectoryFactory.java:350)
    at org.apache.solr.core.SolrCore.getNewIndexDir(SolrCore.java:256)
    at org.apache.solr.core.SolrCore.initIndex(SolrCore.java:469)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:759)
    ... 13 more
Caused by: java.net.UnknownHostException: nameservice1
    ... 30 more
6541 [coreLoadExecutor-4-thread-1] ERROR org.apache.solr.core.CoreContainer - null:org.apache.solr.common.SolrException: Unable to create core: core0
    at org.apache.solr.core.CoreContainer.recordAndThrow(CoreContainer.java:1150)
    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:666)
    at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:364)
    at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:356)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.solr.common.SolrException: java.net.UnknownHostException: nameservice1
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:835)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:629)
    at org.apache.solr.core.ZkContainer.createFromZk(ZkContainer.java:270)
    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:655)
    ... 10 more
Caused by: java.lang.IllegalArgumentException: java.net.UnknownHostException: nameservice1
    at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:414)
    at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:164)
    at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:129)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:436)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:403)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:125)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2262)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:86)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2296)
    at org.apache.hadoop.fs.FileSystem$Cache.getUnique(FileSystem.java:2284)
    at org.apache.hadoop.fs.FileSystem.newInstance(FileSystem.java:362)
    at org.apache.solr.store.hdfs.HdfsDirectory.<init>(HdfsDirectory.java:59)
    at org.apache.solr.core.HdfsDirectoryFactory.create(HdfsDirectoryFactory.java:154)
    at org.apache.solr.core.CachingDirectoryFactory.get(CachingDirectoryFactory.java:350)
    at org.apache.solr.core.SolrCore.getNewIndexDir(SolrCore.java:256)
    at org.apache.solr.core.SolrCore.initIndex(SolrCore.java:469)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:759)
    ... 13 more
Caused by: java.net.UnknownHostException: nameservice1
    ... 30 more
**

I've started looking into the code to figure out why this is happening, but haven't been able to pinpoint it. HdfsDirectory creates a new FileSystem object with a Configuration that I can only assume is built from the solr.hdfs.confdir system property. FileSystem goes through several methods and ends up at NameNodeProxies while trying to talk to the NameNode(s), where it looks for DFS_CLIENT_FAILOVER_PROXY_PROVIDER_KEY_PREFIX (dfs.client.failover.proxy.provider) combined with the "host name" of the logical name node.
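To make that lookup concrete, here is a minimal sketch of the check as I understand it, using only the JDK so it can be run without any Hadoop jars. The class HaConfigCheck and its method names are hypothetical and are not part of Solr or Hadoop; the property keys are the standard HDFS HA client settings. The createNonHAProxy frame in the trace suggests that the Configuration handed to FileSystem did not contain dfs.client.failover.proxy.provider.nameservice1, so the client treated nameservice1 as a real host name. That reading is an assumption, and the sketch only checks whether a given hdfs-site.xml holds the keys an HA client would need.

```java
import java.io.FileInputStream;
import java.io.InputStream;
import java.net.URI;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical diagnostic helper; not part of Solr or Hadoop. */
final class HaConfigCheck {
    static final String PROXY_PROVIDER_PREFIX = "dfs.client.failover.proxy.provider";

    /** Reads the name/value pairs from a Hadoop-style *-site.xml stream. */
    static Map<String, String> parseSiteXml(InputStream in) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);
        Map<String, String> props = new HashMap<>();
        NodeList nodes = doc.getElementsByTagName("property");
        for (int i = 0; i < nodes.getLength(); i++) {
            Element p = (Element) nodes.item(i);
            NodeList names = p.getElementsByTagName("name");
            NodeList values = p.getElementsByTagName("value");
            if (names.getLength() > 0 && values.getLength() > 0) {
                props.put(names.item(0).getTextContent().trim(),
                          values.item(0).getTextContent().trim());
            }
        }
        return props;
    }

    /** True if a failover proxy provider is configured for the URI's host part. */
    static boolean isLogicalNameservice(Map<String, String> props, URI nameNodeUri) {
        return props.containsKey(PROXY_PROVIDER_PREFIX + "." + nameNodeUri.getHost());
    }

    /** Lists the HA client keys that are absent for the given nameservice. */
    static List<String> missingHaKeys(Map<String, String> props, String ns) {
        List<String> missing = new ArrayList<>();
        if (!splitCsv(props.get("dfs.nameservices")).contains(ns)) {
            missing.add("dfs.nameservices");
        }
        if (!props.containsKey(PROXY_PROVIDER_PREFIX + "." + ns)) {
            missing.add(PROXY_PROVIDER_PREFIX + "." + ns);
        }
        String idsKey = "dfs.ha.namenodes." + ns;
        if (!props.containsKey(idsKey)) {
            missing.add(idsKey);
        } else {
            // Each NameNode ID needs its own RPC address entry.
            for (String id : splitCsv(props.get(idsKey))) {
                String rpcKey = "dfs.namenode.rpc-address." + ns + "." + id;
                if (!props.containsKey(rpcKey)) {
                    missing.add(rpcKey);
                }
            }
        }
        return missing;
    }

    /** Splits a comma-separated value into trimmed, non-empty items. */
    private static List<String> splitCsv(String value) {
        List<String> items = new ArrayList<>();
        if (value != null) {
            for (String item : value.split(",")) {
                if (!item.trim().isEmpty()) {
                    items.add(item.trim());
                }
            }
        }
        return items;
    }

    // Usage: java HaConfigCheck /path/to/hdfs-site.xml nameservice1
    public static void main(String[] args) throws Exception {
        Map<String, String> props;
        try (InputStream in = new FileInputStream(args[0])) {
            props = parseSiteXml(in);
        }
        String ns = args[1];
        System.out.println("logical URI: "
                + isLogicalNameservice(props, URI.create("hdfs://" + ns)));
        for (String key : missingHaKeys(props, ns)) {
            System.out.println("missing: " + key);
        }
    }
}
```

Running it against the hdfs-site.xml in the directory given by solr.hdfs.confdir shows whether the file itself is complete. If nothing is reported missing, the remaining question is whether Solr actually loads that file into the Configuration it passes to FileSystem.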
Given that this information is provided, via the configuration directory named on the command line:

**
java -jar start.jar -DzkHost=dev-hadoop02.xio.stl:2181,dev-hadoop03.xio.stl:2181,dev-hadoop04.xio.stl:2181 -DnumShards=4 -Dbootstrap_conf=true -Dsolr.hdfs.confdir=/etc/hadoop/conf.cloudera.hdfs1
**

and in the client configuration files in that directory:

**
# grep -B1 -A2 dfs.client.fail hdfs-site.xml
<property>
  <name>dfs.client.failover.proxy.provider.nameservice1</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
**

How do I get HA working?

Thanks,
Greg

Greg Walters | Operations Team
530 Maryville Center Drive, Suite 250
St. Louis, Missouri 63141
t. 314.225.2745 | c. 314.225.2797
gwalt...@sherpaanalytics.com
www.sherpaanalytics.com