MS Azure does not support Solr 4.9 on HDI, so I am posting here. I would
like to write index collection data to HDFS (hosted on ADL).

Note: I can reach ADL from the hadoop fs command line, so Hadoop is
configured correctly to access ADL:
hadoop fs -ls adl://

This is what I have done so far:
1. Copied all required jars to the Solr ext lib folder:
sudo cp -f /usr/hdp/current/hadoop-client/*.jar /usr/hdp/current/solr/example/lib/ext
sudo cp -f /usr/hdp/current/hadoop-client/lib/*.jar /usr/hdp/current/solr/example/lib/ext
sudo cp -f /usr/hdp/current/hadoop-hdfs-client/*.jar /usr/hdp/current/solr/example/lib/ext
sudo cp -f /usr/hdp/current/hadoop-hdfs-client/lib/*.jar /usr/hdp/current/solr/example/lib/ext
sudo cp -f /usr/hdp/current/storm-client/contrib/storm-hbase/storm-hbase*.jar /usr/hdp/current/solr/example/lib/ext
sudo cp -f /usr/hdp/current/phoenix-client/lib/phoenix*.jar /usr/hdp/current/solr/example/lib/ext
sudo cp -f /usr/hdp/current/hbase-client/lib/hbase*.jar /usr/hdp/current/solr/example/lib/ext

This includes the Azure Data Lake jars as well.

2. Edited the solrconfig.xml file for my collection:

<dataDir>${solr.core.name}/data/</dataDir>

<directoryFactory name="DirectoryFactory" class="solr.HdfsDirectoryFactory">
  <str name="solr.hdfs.home">adl://esodevdleus2.azuredatalakestore.net/clusters/esohadoopdeveus2/solr/</str>
  <str name="solr.hdfs.confdir">/usr/hdp/2.6.2.25-1/hadoop/conf</str>
  <str name="solr.hdfs.blockcache.global">${solr.hdfs.blockcache.global:true}</str>
  <bool name="solr.hdfs.blockcache.enabled">true</bool>
  <int name="solr.hdfs.blockcache.slab.count">1</int>
  <bool name="solr.hdfs.blockcache.direct.memory.allocation">true</bool>
  <int name="solr.hdfs.blockcache.blocksperbank">16384</int>
  <bool name="solr.hdfs.blockcache.read.enabled">true</bool>
  <bool name="solr.hdfs.nrtcachingdirectory.enable">true</bool>
  <int name="solr.hdfs.nrtcachingdirectory.maxmergesizemb">16</int>
</directoryFactory>


When I deploy this collection to Solr, I get this error message:

<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">2189</int>
</lst>
<lst name="failure">
<str>org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:Error CREATEing SolrCore 'ems-collection_shard2_replica2': Unable to create core: ems-collection_shard2_replica2 Caused by: Class org.apache.hadoop.fs.adl.HdiAdlFileSystem not found</str>
<str>org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:Error CREATEing SolrCore 'ems-collection_shard2_replica1': Unable to create core: ems-collection_shard2_replica1 Caused by: Class org.apache.hadoop.fs.adl.HdiAdlFileSystem not found</str>
<str>org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:Error CREATEing SolrCore 'ems-collection_shard1_replica1': Unable to create core: ems-collection_shard1_replica1 Caused by: Class org.apache.hadoop.fs.adl.HdiAdlFileSystem not found</str>
<str>org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:Error CREATEing SolrCore 'ems-collection_shard1_replica2': Unable to create core: ems-collection_shard1_replica2 Caused by: Class org.apache.hadoop.fs.adl.HdiAdlFileSystem not found</str>
</lst>
</response>
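
Since the error is a "Class ... not found" for org.apache.hadoop.fs.adl.HdiAdlFileSystem, one thing that may help narrow it down is checking whether any jar copied in step 1 actually contains that class. Below is a minimal sketch of such a check, assuming Python 3 is available on the node. The function name find_class_in_jars is hypothetical, and the default directory is simply the ext folder from step 1.

```python
import sys
import zipfile
from pathlib import Path


def find_class_in_jars(lib_dir, class_name):
    """Return the names of jars directly under lib_dir that contain class_name."""
    # A class a.b.C is stored in a jar as the entry a/b/C.class
    entry = class_name.replace(".", "/") + ".class"
    matches = []
    for jar in sorted(Path(lib_dir).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    matches.append(jar.name)
        except (zipfile.BadZipFile, OSError):
            # Skip files that are not readable zip archives (e.g. broken symlinks)
            continue
    return matches


if __name__ == "__main__":
    # Default to the Solr ext folder used in step 1; override with argv[1]
    lib_dir = sys.argv[1] if len(sys.argv) > 1 else "/usr/hdp/current/solr/example/lib/ext"
    hits = find_class_in_jars(lib_dir, "org.apache.hadoop.fs.adl.HdiAdlFileSystem")
    print(hits if hits else "class not found in any jar")
```

If the script reports no matching jar, the class is probably shipped in a jar outside the directories copied in step 1. Running the same check against other Hadoop lib directories on the node might locate it.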


Has anyone done this and can help me out?

Thanks,

Abhi


-- 
Abhi Basu
