Thank you very much Otis, regular old Solr
distributed search was the piece I was missing. Now it's hands-on time!
--
Rui

Hi Rui,
You don't need to merge the resulting indices (1 index per Reducer is
what I assume you are asking about). Each could be copied to a
different Solr server and you could then use regular old Solr
distributed search to search across them.
You don't want to search the indices while they are still in HDFS.
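The distributed search Otis describes works by passing Solr a shards parameter that lists every per-reducer index, as host:port/core entries. A minimal sketch of assembling such a query URL (the host names and core names here are assumptions for illustration, not values from this thread):

```java
import java.util.List;

public class ShardsParam {
    // Build the "shards" parameter for Solr distributed search:
    // a comma-separated list of host:port/core entries, one per index.
    static String buildShardsParam(List<String> shardCores) {
        return String.join(",", shardCores);
    }

    public static void main(String[] args) {
        // One entry per index produced by a Reducer, after each index
        // has been copied out of HDFS onto its own Solr server.
        List<String> shards = List.of(
                "solr1:8983/solr/core0",  // index from reducer 0 (assumed host)
                "solr2:8983/solr/core1"); // index from reducer 1 (assumed host)
        String url = "http://solr1:8983/solr/core0/select?q=*:*&shards="
                + buildShardsParam(shards);
        System.out.println(url);
    }
}
```

Any one of the servers can receive the request; the shards parameter tells it which peers to fan the query out to and merge results from.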

Thank you very much for your replies,
Yes Otis, one possibility is to copy my data to HDFS and then apply a Map
function to create the intermediate indexes across the cluster in HDFS,
using the Solr Java library.
I have some doubts concerning this solution:
1 - The int

You may also want to take a look at the DataStax Enterprise product, which
combines Cassandra, Solr, and Hadoop.
See:
http://www.datastax.com/products/enterprise
-- Jack Krupansky
-----Original Message-----
From: Rui Vaz
Sent: Friday, October 12, 2012 2:35 PM
To: solr-user@lucene.apache.org
Subj

Hi Rui,
If you're going to shard and/or replicate your index, then be sure to take
a look at CloudSolrServer in the SolrJ client library. CloudSolrServer is
an extension of SolrServer that works with ZooKeeper to understand the
shards and replicas in a Solr cluster. Using CloudSolrServer, there is
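A sketch of the CloudSolrServer usage being described: the client is pointed at ZooKeeper rather than at any individual Solr host, and routing to shards and replicas is handled for you. The ZooKeeper address and collection name below are assumptions, and running this requires a live SolrCloud cluster plus the SolrJ jars:

```java
import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class CloudIndexing {
    public static void main(String[] args) throws Exception {
        // CloudSolrServer reads cluster state (shards and replicas) from
        // ZooKeeper, so only the ZooKeeper address is needed here.
        CloudSolrServer server = new CloudSolrServer("zk1:2181"); // assumed zkHost
        server.setDefaultCollection("collection1");               // assumed name

        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "doc-1");
        doc.addField("text", "indexed via CloudSolrServer");
        server.add(doc);   // routed to the appropriate shard automatically
        server.commit();

        server.shutdown();
    }
}
```

No test is included since the sketch cannot run without a SolrCloud cluster.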

Hello Rui,
If your data to be indexed is in HDFS, using MapReduce to parallelize
indexing is still a good idea.
Otis
--
Search Analytics - http://sematext.com/search-analytics/index.html
Performance Monitoring - http://sematext.com/spm/index.html
On Fri, Oct 12, 2012 at 2:35 PM, Rui Vaz wrote: