If you do end up figuring it out, would you mind letting me know? Right
now, our solution is to use an older version of SolrJ, but that means we
miss out on some of the improvements/bugfixes around aliases.
Thanks,
Michael Della Bitta
Applications Developer
o: +1 646 532 3062 | c: +1 917 477 7
Michael,
We replaced the Lucene jars but ran into a problem with an incompatible version of
Apache HttpComponents. Still figuring it out.
Dmitriy
--
View this message in context:
http://lucene.472066.n3.nabble.com/Problem-running-Solr-indexing-in-Amazon-EMR-tp4083636p4084121.html
Sent from the Solr
Hi Dmitriy,
Just out of curiosity, have you tried replacing the Lucene jars with a
bootstrap action?
Michael Della Bitta
Applications Developer
o: +1 646 532 3062 | c: +1 917 477 7906
appinions inc.
“The Science of Influence Marketing”
18 East 41st Street
New York, NY 10017
t: @appinions
Michael,
The Amazon Hadoop distribution has Lucene 2.9.4 jars in its /lib directory, and they
conflict with the Solr 4.4 we are using. Once we get past that problem, we run into the
conflict with Apache HttpComponents that you describe. I think the best bet would
be for us to build our own AMI to avoid these dependencies.
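
A quick way to see which jars in a directory carry a copy of a given class, and so where a conflict like this might come from, is to scan the archives directly. This is only a rough sketch and not something posted in this thread: the helper name `find_class_in_jars` is made up for illustration, the directory to scan is whatever you pass in, and it assumes the jars are ordinary zip archives.

```python
import zipfile
from pathlib import Path


def find_class_in_jars(root, class_name):
    """Return sorted paths of jars under `root` that contain `class_name`.

    `class_name` is a fully qualified Java class name, for example
    "org.apache.lucene.index.IndexWriter". (Hypothetical helper.)
    """
    # A class a.b.C is stored in a jar as the entry a/b/C.class.
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(Path(root).rglob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    hits.append(str(jar))
        except zipfile.BadZipFile:
            # Skip files that are named *.jar but are not valid archives.
            continue
    return hits
```

Calling `find_class_in_jars(some_dir, "org.apache.lucene.index.IndexWriter")` returns every jar under `some_dir` that ships that class. More than one hit on the same classpath suggests a version clash.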
Dmitriy,
I don't believe that EMR includes Solr or Lucene in its AMIs. But
there was a recent AMI update that ruined some things for us. Have you
tried using an older AMI?
One headache for us has been that the EMR AMI uses an older version of
Apache HttpComponents than that of Solr 4.3,
Erick,
There is actually supposed to be just one version of Solr, the one bundled with our
map/reduce jar. To be clear: the Map/Reduce job is generating a new index, not
reading an existing one. But it fails even before that, when an instance of
EmbeddedSolrServer is created at the first line of the following code.
Have you checked the luceneMatchVersion in all your solrconfig.xml
files? I'm guessing it's set to 40 somewhere in the process, as
evidenced by the line:
org.apache.lucene.codecs.lucene40.Lucene40FieldInfosFormat.<init>(
Lucene40FieldInfosFormat.java:99)
so it looks like somehow a Lucene 4.0 codec is bein
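
For checking luceneMatchVersion across several config directories at once, something like the following could help. It is a rough sketch under a few assumptions: the helper name `lucene_match_versions` is hypothetical, each solrconfig.xml is assumed to be plain well-formed XML, and the value is reported exactly as written in the file (property placeholders are not resolved).

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def lucene_match_versions(root):
    """Map each solrconfig.xml under `root` to its luceneMatchVersion text.

    The value is None when the element is missing or empty.
    (Hypothetical helper, not part of Solr.)
    """
    versions = {}
    for cfg in sorted(Path(root).rglob("solrconfig.xml")):
        # Search the whole tree so nesting depth does not matter.
        element = ET.parse(cfg).getroot().find(".//luceneMatchVersion")
        if element is not None and element.text and element.text.strip():
            versions[str(cfg)] = element.text.strip()
        else:
            versions[str(cfg)] = None
    return versions
```

Printing the returned dictionary shows at a glance whether any config file declares a different match version from the rest.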
Erick,
Thank you for the reply. The Cloudera image includes Solr 4.3. I'm not sure what
version Amazon EMR includes. We are not directly referencing or using their
version of Solr but instead build our jar against Solr 4.4 and include all
dependencies in our jar file. Also error occurs not while read
What version of Solr is Cloudera's CDH built on? Looks to me like
the Solr you're using to read the M/R produced index is different
than the one used to build it, or different from the version specified
in the Solr configs, as evidenced by the LUCENE40 in the error
message. See <luceneMatchVersion> in solrconfig.xml.
But probably a be