Re: SolrCloud - distributed architecture considerations

Erick Erickson Sun, 14 Oct 2012 06:40:04 -0700

First, remember that SolrCloud is relatively new; operational issues
like this will doubtless accrue "folk wisdom" as we all gain experience...

But my current thinking is that the remote installations are essentially
completely separate installations with no knowledge of each other. Your
indexing process needs to be smart enough to send _one_ update
request to each cluster. Or perhaps each cluster has its own indexing
process. The point is that the nodes in the remote installations don't
know about each other, so they don't incur the communications lag.
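
A rough sketch of the "one update request per cluster" idea, in Python.
Everything specific here is an assumption for illustration: the host
names and site labels are hypothetical, and it assumes each cluster
accepts JSON documents POSTed to /solr/<collection>/update. Adjust it to
whatever your installations actually expose.

```python
import json
import urllib.request

# Hypothetical entry points, one per independent cluster (assumption).
CLUSTERS = {
    "us-east": "http://solr-us.example.com:8983/solr/collection1",
    "eu-west": "http://solr-eu.example.com:8983/solr/collection1",
}


def build_update_request(base_url, docs):
    """Build a single JSON update request for one cluster."""
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/update",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def index_everywhere(docs, clusters=CLUSTERS, send=urllib.request.urlopen):
    """Send the same batch exactly once to each cluster.

    Returns a dict mapping cluster name to None on success, or to the
    error text if that cluster could not be reached. A failure at one
    site does not stop the batch from going to the others.
    """
    results = {}
    for name, base_url in clusters.items():
        request = build_update_request(base_url, docs)
        try:
            send(request)
            results[name] = None
        except OSError as exc:  # urllib's URLError is a subclass of OSError
            results[name] = str(exc)
    return results
```

The clusters never talk to each other in this arrangement; only the
indexing process crosses the slow link, once per batch per site. Keeping
the sites consistent when one of them is unreachable (retry, queue, etc.)
is left to the indexing process and is not handled by this sketch.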

I say that because I'm pretty sure you're exactly right. The supposition
is that the pipe connecting the remote installations is slow/expensive/whatever,
and the chatter that'll go back and forth if they are aware of each other
will be too expensive to really be practical. YMMV, of course. I can
imagine low-frequency updates where at least the indexing process
wouldn't care, but then you'd still have the querying crossing the
expensive pipe...

As you can tell, I don't have a tried-and-tested answer...

Best
Erick

On Sun, Oct 14, 2012 at 5:41 AM, AlexeyK <lex.kudi...@gmail.com> wrote:
> Hi,
> As far as I understand, SolrCloud eliminates the master-slave specifics, and
> automates both update and search seamlessly.
> What should I take into account when configuring SolrCloud for a large
> customer with multiple physical locations?
> I mean, for older Solr I would define master 'close to the data' with batch
> replication to the search server (slave). I would have several such slaves
> for different geographical locations as well.
> How can I ensure (if at all) that search queries do not cross geographical
> boundaries? As far as I understand, SolrCloud routes to any arbitrary active
> replica.
> How can I control the indexing process so that the update request is routed
> to the closest server? If SolrCloud accidentally elects some remote replica
> as the current leader, the indexing process will deteriorate due to networking
> issues; moreover, the update requests will also be bounced back across the
> network as part of the online replication process.
>
> Am I missing something fundamental in my assumptions/understanding of SolrCloud
> features?
>
> Thanks a lot,
>
> Alexey
