Re: external indexer for Solr Cloud

Jack Krupansky Mon, 01 Sep 2014 06:20:07 -0700

Okay, but please clarify further - do you simply wish to run DIH externally, but still send each document to SolrCloud for indexing, or... are you expecting to generate the index completely external to the cluster and then somehow "merge" that DIH "index" into the SolrCloud index?

It would be great to have a "standalone DIH" that runs as a separate server and then sends standard Solr update requests to a Solr cluster.
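
As a rough illustration of what such a standalone importer might do before it talks to Solr, here is a minimal Python sketch. Everything specific in it is an assumption for illustration only: the `items` and `categories` tables, the field names, and the trivial "preprocessing" step are hypothetical, and sqlite3 simply stands in for whatever database is really used. It is not DIH, only the same fetch, join, preprocess, and batch idea done outside the cluster.

```python
import sqlite3


def fetch_documents(conn):
    """Join rows from two hypothetical tables and yield Solr-style dicts."""
    rows = conn.execute(
        "SELECT i.id, i.title, c.name "
        "FROM items i JOIN categories c ON c.id = i.category_id "
        "ORDER BY i.id"
    )
    for item_id, title, category in rows:
        # Stand-in for the heavier preprocessing done outside the cluster.
        yield {
            "id": str(item_id),
            "title": title.strip(),
            "category": category.lower(),
        }


def batches(docs, size):
    """Group an iterable of documents into lists of at most `size` items."""
    batch = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly smaller, batch.
        yield batch
```

Each batch produced this way could then be sent to the cluster as an ordinary update request, so the database work and preprocessing never run on a Solr node.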


-- Jack Krupansky

-----Original Message----- From: Lee Chunki

Sent: Sunday, August 31, 2014 8:55 PM
To: solr-user@lucene.apache.org
Subject: Re: external indexer for Solr Cloud

Hi Shawn and Jack,

Thank you for your reply.

Yes, I want to run the data import handler independently and sync it to SolrCloud, because my current DIH node does not only the DB fetch & join but also a lot of preprocessing.


Thanks,
Chunki.


On Aug 30, 2014, at 1:34 AM, Jack Krupansky <j...@basetechnology.com> wrote:

My other thought was that maybe he wants to do index updates outside of the cluster that is handling queries, and then copy in the completed index. Or... maybe take replicas out of the query rotation while they are updated. Or... maybe this is yet another X-Y problem!


-- Jack Krupansky

-----Original Message----- From: Shawn Heisey
Sent: Friday, August 29, 2014 11:19 AM
To: solr-user@lucene.apache.org
Subject: Re: external indexer for Solr Cloud

On 8/29/2014 5:21 AM, Lee Chunki wrote:

Is there any way to run an external indexer for Solr Cloud?


Jack asked an excellent question.  What do you mean by this?  Unless
you're using the dataimport handler, all indexing is external to Solr.

My situation is:

* running two indexers (for failover) and two searchers.
* only the two searchers are used for service.
* planning to move to Solr Cloud.
However, I wonder whether, if I run the indexing job on one of the Solr Cloud servers, that server’s load would be higher than the other nodes'.
So, I want to build the index outside of Solr Cloud, but….


In SolrCloud, every shard replica will be indexing -- it's not like
old-style replication, where the master indexes everything and the
slaves copy the completed index.  The leader of each shard will be
working slightly harder than the other replicas, but you really don't
need to worry too much about sending all your updates to one server --
those requests get duplicated to the other servers and they all index
them, almost in parallel.
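
To make "all indexing is external" concrete, here is a minimal sketch of an external indexer building a standard JSON update request with only the Python standard library. The base URL, the collection name, and the helper names `build_update_request` and `send` are assumptions for illustration; the `/update` handler with a JSON body and the `commitWithin` parameter are standard Solr, but check them against the version you run.

```python
import json
import urllib.request


def build_update_request(base_url, collection, docs, commit_within_ms=10000):
    """Build a POST that adds `docs` (a list of dicts) to a collection."""
    url = "{}/{}/update?commitWithin={}".format(
        base_url.rstrip("/"), collection, commit_within_ms
    )
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30):
    """Send the request to whichever node the URL points at."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.status
```

A call such as `send(build_update_request("http://localhost:8983/solr", "collection1", [{"id": "1"}]))` can target any node in the cluster (the host and collection here are placeholders); as described above, the update is forwarded to the right shard leader and its replicas.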

For my setup (non-cloud, but sharded), I use Pacemaker to ensure that
only one of my servers is running my indexing program and haproxy (plus
its shared IP address).

Thanks,
Shawn
