Updating two systems in parallel turns into a two-phase commit problem almost
immediately: if one write succeeds and the other fails, the clusters diverge.
So you need a persistent pool of updates that both clusters pull from.
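
To make that concrete, here is a minimal sketch of the idea in plain Java. It is
not a Solr or SolrJ API: UpdateLog and ClusterPuller are hypothetical names, the
"pool" is just an append-only file with one single-line update per row, and the
sink is a stand-in for whatever actually sends the update to a cluster (for
example a call on that cluster's CloudSolrClient). The point is only that the
indexer writes once to the log, and each cluster advances its own offset
independently.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical append-only update log: one single-line update per row. */
class UpdateLog {
    private final Path file;

    UpdateLog(Path file) {
        this.file = file;
    }

    /** The indexer appends here instead of writing to either cluster directly. */
    synchronized void append(String update) throws IOException {
        Files.write(file, List.of(update), StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    /** Returns every update at or after the given line offset. */
    synchronized List<String> readFrom(int offset) throws IOException {
        if (!Files.exists(file)) {
            return List.of();
        }
        List<String> all = Files.readAllLines(file, StandardCharsets.UTF_8);
        return all.subList(Math.min(offset, all.size()), all.size());
    }
}

/** One puller per cluster; each tracks its own position in the log. */
class ClusterPuller {
    private final UpdateLog log;
    private final Consumer<String> sink; // assumption: wraps the real send-to-cluster call
    private int offset = 0;              // a real system would persist this as well

    ClusterPuller(UpdateLog log, Consumer<String> sink) {
        this.log = log;
        this.sink = sink;
    }

    /** Applies all pending updates in order and returns how many were applied. */
    int pull() throws IOException {
        List<String> pending = log.readFrom(offset);
        for (String update : pending) {
            sink.accept(update); // if this throws, offset still points at the failed update
            offset++;
        }
        return pending.size();
    }
}
```

If one cluster is down, its puller simply falls behind and catches up later from
its own offset; the other cluster and the indexer are not blocked. In practice
the log would more likely be a message queue or a database table than a flat
file.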

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Feb 9, 2016, at 4:15 PM, Shawn Heisey <apa...@elyograg.org> wrote:
> 
> On 2/9/2016 1:43 PM, tedsolr wrote:
>> I expect that rsync can be used initially to copy the collection data
>> folders and the zookeeper data and transaction log folders. So after
>> verifying Solr/ZK is functional after the install, shut it down and perform
>> the copy. This may sound slow but my production index size is < 100GB. Is
>> this approach reasonable?
>> 
>> So now to keep the warm site in sync, I could use rsync on a scheduled basis
>> but I assume there's a better way. The ref guide says to send all indexing
>> requests to the second cluster at the same time they are sent to the active
>> cluster. I use SolrJ for all requests. So would this entail using a second
>> CloudSolrClient instance that only knows about the second cluster? Seems
>> reasonable but I don't want to lengthen the response time for the users. Is
>> this just a software problem to work out (separate thread)? Or is there a
>> SolrJ solution (async calls)?
> 
> The way I would personally handle keeping both systems in sync at the
> moment would be to modify my indexing system to update both systems in
> parallel.  That likely would involve a second CloudSolrClient instance.
> 
> There's a new feature called "Cross Data Center Replication" but as far
> as I know, it is only available in development versions, and has not
> been made available in any released version of Solr.
> 
> http://yonik.com/solr-cross-data-center-replication/
> 
> This new feature may become available in 6.0 or a later 6.x release.  I
> do not have any concrete information about the expected release date for
> 6.0.
> 
> Thanks,
> Shawn
> 
