Hello, I am evaluating Solr 4.2 and Elasticsearch (I am new to both) for a search API where the data sits in Cassandra.

Getting started with Elasticsearch was pretty straightforward, and within a day I was able to write an ES "river" <http://www.elasticsearch.org/guide/reference/river/> which pulls data from Cassandra and indexes it in ES. Now I am trying to implement something similar with Solr so that I can compare the two.

Getting started with solr/example <http://lucene.apache.org/solr/4_2_0/tutorial.html> was pretty easy, and the example Solr instance works. But the example folder contains a whole bunch of stuff which I am not sure I need: http://pastebin.com/Gv660mRT . I am sure I don't need 53 directories and 527 files.

So my questions are:

1. How can I get a bare-bones Solr app up and running with a minimal set of configuration? (I will build on it when needed, using /example as a reference.)
2. What is the best practice for running Solr in production? Is an approach like this jetty+nginx setup recommended: http://sacharya.com/nginx-proxy-to-jetty-for-java-apps/ ?

Once I am done setting up a simple Solr instance:

3. What is the general practice for importing data into Solr? For now, I am writing a Python script which will read data in bulk from Cassandra and send it to Solr.

--
Thanks,
-Utkarsh
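
P.S. To make question 3 concrete, here is a rough sketch of the kind of bulk-import script I have in mind. It is only an outline under several assumptions: the update URL is the one from the stock example (a real setup may use a different host or core), `rows` stands in for whatever iterable of dicts the Cassandra client returns, and the helper names (`batches`, `build_request`, `post`, `index_rows`) are made up for illustration. It posts documents to Solr's JSON update handler in batches and commits once at the end.

```python
import json
import urllib.request
from itertools import islice

# Assumption: the stock example instance; adjust host/core for a real setup.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update/json"


def batches(rows, size):
    """Yield lists of at most `size` items from any iterable."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def build_request(payload, url=SOLR_UPDATE_URL):
    """Build (but do not send) a POST request with a JSON body."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )


def post(request):
    """Send the request and return the raw response body."""
    with urllib.request.urlopen(request) as response:
        return response.read()


def index_rows(rows, batch_size=500, send=post):
    """Send rows to Solr in batches, then commit once. Returns the doc count."""
    sent = 0
    for chunk in batches(rows, batch_size):
        send(build_request(chunk))            # JSON array of documents
        sent += len(chunk)
    send(build_request({"commit": {}}))       # single commit at the end
    return sent
```

In use, the Cassandra side would just need to yield one dict per document whose keys match fields in the Solr schema, e.g. `index_rows(fetch_rows_from_cassandra())`, where `fetch_rows_from_cassandra` is a placeholder for my own reader. Committing once at the end instead of per batch is a guess at what is reasonable; corrections welcome.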