Re: replicate indexing to second site

Shawn Heisey Tue, 09 Feb 2016 16:23:00 -0800

On 2/9/2016 1:43 PM, tedsolr wrote:
> I expect that rsync can be used initially to copy the collection data
> folders and the zookeeper data and transaction log folders. So after
> verifying Solr/ZK is functional after the install, shut it down and perform
> the copy. This may sound slow but my production index size is < 100GB. Is
> this approach reasonable?
>
> So now to keep the warm site in sync, I could use rsync on a scheduled basis
> but I assume there's a better way. The ref guide says to send all indexing
> requests to the second cluster at the same time they are sent to the active
> cluster. I use SolrJ for all requests. So would this entail using a second
> CloudSolrClient instance that only knows about the second cluster? Seems
> reasonable but I don't want to lengthen the response time for the users. Is
> this just a software problem to work out (separate thread)? Or is there a
> SolrJ solution (async calls)?


At the moment, the way I would personally keep both systems in sync is
to modify my indexing system so that it updates both systems in
parallel.  That would likely involve a second CloudSolrClient instance.
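
As a rough illustration of the "update both in parallel" idea, here is a
minimal sketch that uses only the JDK.  The IndexTarget interface and
the DualSiteIndexer class are hypothetical names invented for this
example, not part of SolrJ.  The assumption is that each IndexTarget
would wrap its own CloudSolrClient (one per cluster) and forward the
update to it.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical abstraction: one implementation per cluster. In a real
// system each implementation is assumed to wrap its own CloudSolrClient.
interface IndexTarget {
    void index(String docId) throws Exception;
}

// Sends every update to both sites on background threads so the caller
// does not have to wait for the slower site.
final class DualSiteIndexer implements AutoCloseable {
    private final IndexTarget primary;
    private final IndexTarget secondary;
    private final ExecutorService pool = Executors.newFixedThreadPool(2);

    DualSiteIndexer(IndexTarget primary, IndexTarget secondary) {
        this.primary = primary;
        this.secondary = secondary;
    }

    // Returns a future that completes when both sites have accepted the
    // update, and completes exceptionally if either site fails.
    CompletableFuture<Void> indexBoth(String docId) {
        CompletableFuture<Void> first = submit(primary, docId);
        CompletableFuture<Void> second = submit(secondary, docId);
        return CompletableFuture.allOf(first, second);
    }

    private CompletableFuture<Void> submit(IndexTarget target, String docId) {
        return CompletableFuture.runAsync(() -> {
            try {
                target.index(docId);
            } catch (Exception e) {
                // Wrap checked exceptions so the future records the failure.
                throw new CompletionException(e);
            }
        }, pool);
    }

    @Override
    public void close() {
        pool.shutdown();
    }
}
```

The caller can either wait on the returned future or attach a callback
that logs failures, so user-facing response time need not depend on the
second site.  What to do when only one site fails (retry, queue, alert)
is left open here and would need its own design.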

There's a new feature called "Cross Data Center Replication" but as far
as I know, it is only available in development versions and has not
been included in any released version of Solr.

http://yonik.com/solr-cross-data-center-replication/

This new feature may become available in 6.0 or a later 6.x release.  I
do not have any concrete information about the expected release date for
6.0.

Thanks,
Shawn
