On the lengthy TODO list is making SolrCloud nodes "rack aware",
which should help with this, but as I recall it's not very high in the
priority queue. The current architecture sends updates and requests
all over the cluster, so lots of messages cross the presumably
expensive pipe between data centers. Not to mention the ZooKeeper
quorum problem.

Hmmm, the "ZooKeeper quorum problem": say 1 ZK node is in DC1
and 2 are in DC2. If DC2 goes down, DC1 will not accept updates,
because the single surviving node is not a majority of the three-node
ensemble, so there is no ZK quorum. I've seen one proposal where you
use 3 DCs, each with a ZK node, to ameliorate this.
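To make the arithmetic concrete: a ZooKeeper ensemble only stays writable while a strict majority of its nodes can reach each other, so with one ZK node in each of three DCs, losing any single DC still leaves 2 of 3. A sketch of such an ensemble's zoo.cfg (hostnames and paths are made up):

```
# zoo.cfg, identical on all three nodes; hostnames are hypothetical
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/lib/zookeeper
clientPort=2181
# One ZK node per datacenter; quorum = 2 of 3, so any one DC can fail
server.1=zk-dc1.example.com:2888:3888
server.2=zk-dc2.example.com:2888:3888
server.3=zk-dc3.example.com:2888:3888
```

Note the tradeoff, though: every ZK write now waits for an ack from a majority that spans the WAN, so cross-DC latency taxes the whole cluster.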

But all this is an issue only if the communications link between the
datacenters is "expensive", where that term can mean that it literally
costs more, that it is slow, or whatever.
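For what it's worth, the pre-SolrCloud pull replication Kevin mentions was configured on the pulling side roughly like the fragment below; whether a SolrCloud node can cleanly act as the puller is exactly the open question. The URL and interval here are made up:

```xml
<!-- solrconfig.xml on the pulling (slave) core; masterUrl is hypothetical -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <!-- Poll the remote datacenter's core for new index versions -->
    <str name="masterUrl">http://solr-dc1.example.com:8983/solr/collection1</str>
    <!-- hh:mm:ss between polls; NRT isn't required, so a long interval is fine -->
    <str name="pollInterval">00:05:00</str>
  </lst>
</requestHandler>
```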

Best
Erick

On Tue, Jun 25, 2013 at 12:14 PM, Otis Gospodnetic
<otis.gospodne...@gmail.com> wrote:
> Uh, I remember that email, but can't recall where we did it.... will
> try to recall it some more and reply if I can manage to dig it out of
> my brain...
>
> Otis
> --
> Solr & ElasticSearch Support -- http://sematext.com/
> Performance Monitoring -- http://sematext.com/spm
>
>
>
> On Tue, Jun 25, 2013 at 2:24 PM, Kevin Osborn <kevin.osb...@cbsi.com> wrote:
>> Otis,
>>
>> I did actually stumble upon this link.
>>
>> http://comments.gmane.org/gmane.comp.jakarta.lucene.solr.user/74870
>>
>> This was from you. You were attempting to replicate data from SolrCloud to
>> some other slaves for heavy-duty queries. You said that you accomplished
>> this. Can you provide a few pointers on how you did this? Thanks.
>>
>>
>> On Tue, Jun 25, 2013 at 10:25 AM, Otis Gospodnetic <
>> otis.gospodne...@gmail.com> wrote:
>>
>>> I think what is needed is a Leader that, while being a Leader for its
>>> own Slice in its local Cluster and Collection (I think I'm using all
>>> the latest terminology correctly here), is at the same time a Replica
>>> of its own Leader counterpart in the "Primary Cluster".
>>>
>>> Not currently possible, AFAIK.
>>> Or maybe there is a better way?
>>>
>>> Otis
>>>
>>>
>>>
>>> On Tue, Jun 25, 2013 at 1:07 PM, Kevin Osborn <kevin.osb...@cbsi.com>
>>> wrote:
>>> > We are going to have two datacenters, each with their own SolrCloud and
>>> > ZooKeeper quorums. The end result will be that they should be replicas of
>>> > each other.
>>> >
>>> > One method that has been mentioned is that we should add documents to
>>> > each cluster separately. For various reasons, this may not be ideal for
>>> > us. Instead, we are playing around with the idea of always indexing to
>>> > one datacenter, and then having that replicate to the other datacenter.
>>> > And this is where I am having some trouble on how to proceed.
>>> >
>>> > The nice thing about SolrCloud is that there are no masters and slaves.
>>> > Each node is equal, has the same configs, etc. But in this case, I want
>>> > to have a node in one datacenter poll for changes in another datacenter.
>>> > Before SolrCloud, I would have used master/slave replication. But in the
>>> > SolrCloud world, I am not sure how to configure this setup.
>>> >
>>> > Or are there any better ideas on how to use replication to push or pull
>>> > data from one datacenter to another?
>>> >
>>> > In my case, NRT is not a requirement. And I will also be dealing with
>>> > about 3 collections and 5 or 6 shards.
>>> >
>>> > Thanks.
>>> >
>>> > --
>>> > *KEVIN OSBORN*
>>> > LEAD SOFTWARE ENGINEER
>>> > CNET Content Solutions
>>> > OFFICE 949.399.8714
>>> > CELL 949.310.4677      SKYPE osbornk
>>> > 5 Park Plaza, Suite 600, Irvine, CA 92614
>>>
>>
>>
>>
