Thanks Erick. Can you explain a bit more about the write.lock file? So far I have been copying it over from B to A and haven't seen issues starting the replicas.
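If it does turn out I need to strip it, here is roughly what I'd run before starting Solr on the Cloud A replicas (just a rough sketch; the index paths below are made-up examples, and I'm assuming the write.lock sits in each core's data/index directory):

import os

# Example replica index directories on Cloud A; adjust to your SOLR_HOME layout.
REPLICA_INDEX_DIRS = [
    "/var/solr/data/collection1_shard1_replica1/data/index",
    "/var/solr/data/collection1_shard2_replica1/data/index",
]

def remove_write_locks(index_dirs):
    """Delete any leftover write.lock before starting Solr on the target replicas."""
    for index_dir in index_dirs:
        lock_path = os.path.join(index_dir, "write.lock")
        if os.path.exists(lock_path):
            os.remove(lock_path)
            print("removed " + lock_path)
        else:
            print("no write.lock in " + index_dir)

if __name__ == "__main__":
    remove_write_locks(REPLICA_INDEX_DIRS)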
On Sat, Aug 26, 2017 at 9:25 AM, Erick Erickson <erickerick...@gmail.com> wrote:
> Approach 2 is sufficient. You do have to ensure that you don't copy
> over the write.lock file, however, as you may not be able to start
> replicas if it's there.
>
> There's a relatively little-known third option. You can (ab)use the
> replication API "fetchindex" command, see:
> https://cwiki.apache.org/confluence/display/solr/Index+Replication
> to pull the index from Cloud B to replicas on Cloud A. That has the
> advantage of working even if you are actively indexing to Cloud B.
> NOTE: currently you cannot _query_ Cloud A (the target) while the
> fetchindex is going on, but I doubt you really care since you were
> talking about having Cloud A offline anyway. So for each replica you
> fetch to, you'll send the fetchindex command directly to the replica
> on Cloud A, and the "masterUrl" will be the corresponding replica on
> Cloud B.
>
> Finally, what I'd really do is _only_ have one replica for each shard
> on Cloud A active and fetch to _that_ replica. I'd also delete the
> data dir on all the other replicas for the shard on Cloud A. Then as
> you bring the additional replicas up they'll do a full sync from the
> leader.
>
> FWIW,
> Erick
>
> On Fri, Aug 25, 2017 at 6:53 PM, Wei <weiwan...@gmail.com> wrote:
> > Hi,
> >
> > In our setup there are two Solr clouds:
> >
> > Cloud A: production cloud, serves both writes and reads
> >
> > Cloud B: backup cloud, serves only writes
> >
> > Cloud A and B have the same shard configuration.
> >
> > Write requests are sent to both Cloud A and Cloud B. In certain
> > circumstances, when Cloud A's updates lag behind, we want to bulk copy
> > the binary index from B to A.
> >
> > We have tried two approaches:
> >
> > Approach 1.
> > For Cloud A:
> >     a. delete the collection to wipe out everything
> >     b. create a new collection (data is empty now)
> >     c. shut down the Solr server
> >     d. copy the binary index from Cloud B to the corresponding shard
> >        replicas in Cloud A
> >     e. start the Solr server
> >
> > Approach 2.
> > For Cloud A:
> >     a. shut down the Solr server
> >     b. remove the whole 'data' folder under index/ in each replica
> >     c. copy the binary index from Cloud B to the corresponding shard
> >        replicas in Cloud A
> >     d. start the Solr server
> >
> > Is approach 2 sufficient? I am wondering if deleting and recreating the
> > collection each time is necessary to get the cloud into a "clean" state
> > before copying the binary index between Solr clouds.
> >
> > Thanks for your advice!
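Also, just to make sure I understand the fetchindex option correctly: is the call below roughly what you mean? A rough sketch of what I'd send to each Cloud A replica (host and core names are placeholders for our own):

import requests

# Placeholder host/core names; the replication handler is per-core, so the
# URLs use the core names as they appear on each node.
TARGET = "http://cloud-a-node1:8983/solr/collection1_shard1_replica1"  # replica on Cloud A
SOURCE = "http://cloud-b-node1:8983/solr/collection1_shard1_replica1"  # matching replica on Cloud B

# Ask the Cloud A replica to pull the index from the Cloud B replica.
resp = requests.get(
    TARGET + "/replication",
    params={
        "command": "fetchindex",
        "masterUrl": SOURCE + "/replication",  # where to fetch from
        "wt": "json",
    },
    timeout=30,
)
resp.raise_for_status()
print(resp.json())

# fetchindex runs asynchronously; poll the details command to watch progress.
status = requests.get(
    TARGET + "/replication",
    params={"command": "details", "wt": "json"},
    timeout=30,
)
print(status.json())

If that looks right, I'd run it once per shard against the single active replica on Cloud A, as you suggested, and let the remaining replicas sync from the leader when they come up.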