On 8/28/2013 10:48 AM, Daniel Collins wrote:
> What I would ideally like to do is, at the point that I kick off recovery, divert the indexing feed for the "broken" cloud into a transaction log on those machines, run the replication and swap the index in, then replay the transaction log to bring it all up to date. That process is conceptually the same as what the org.apache.solr.cloud.RecoveryStrategy code does.
I don't think any such mechanism exists currently, but it would be extremely useful if it did. If there's not an existing Jira issue, I recommend that you file one. Being able to set up a multi-datacenter cloud with automatic recovery would be awesome; even if recovery took a long time, having it be fully automated would be exceptionally useful.
> Yes, if I could divert that feed at the application level, then I could do what you suggest, but it feels like more work to do that (and build an external transaction log), whereas the code seems to already be in Solr itself; I just need to hook it all up (famous last words!). Our indexing pipeline does a lot of pre-processing work (it's not just pulling data from a database), and since we are only talking about the time taken to do the replication (which should be an hour or less), it feels like we ought to be able to store that in a Solr transaction log (i.e. at the last point in the indexing pipeline).
I think it would have to be a separate transaction log. One problem with really big regular tlogs is that when Solr gets restarted, the entire transaction log that's currently on disk gets replayed. If it were big enough to recover the last several hours to a duplicate cloud, it would take forever to replay on every Solr restart. If the regular tlog were kept small, but a second log covering the last 24 hours were available, Solr could replay those updates when the second cloud came back up.
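Just to sketch what I mean, here's roughly what an application-level second log could look like. This is off the top of my head and not tested: the class name, the tab-separated line format, and the id/title fields are all made up for illustration, and it uses the Solr 4.x SolrJ CloudSolrServer class.

import java.io.*;
import java.nio.charset.StandardCharsets;
import java.nio.file.*;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.common.SolrInputDocument;

/**
 * Application-level "second log": every update the pipeline sends to the
 * healthy cloud is also appended here, so it can be replayed against the
 * recovered cloud once replication has finished.  The field set (id, title)
 * and the tab-separated layout are placeholders, not a real schema.
 */
public class SecondaryUpdateLog {
    private final Path logFile;

    public SecondaryUpdateLog(Path logFile) {
        this.logFile = logFile;
    }

    /** Append one update: epoch millis, id, and title, one line per document. */
    public synchronized void record(long timestampMillis, String id, String title)
            throws IOException {
        String line = timestampMillis + "\t" + id + "\t" + title + "\n";
        Files.write(logFile, line.getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    /** Re-send every update newer than the checkpoint to the recovered cloud. */
    public void replay(long checkpointMillis, CloudSolrServer recoveredCloud)
            throws IOException, SolrServerException {
        try (BufferedReader reader =
                Files.newBufferedReader(logFile, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String[] parts = line.split("\t", 3);
                if (Long.parseLong(parts[0]) <= checkpointMillis) {
                    continue;  // already present in the replicated index
                }
                SolrInputDocument doc = new SolrInputDocument();
                doc.addField("id", parts[1]);
                doc.addField("title", parts[2]);
                recoveredCloud.add(doc);
            }
        }
        recoveredCloud.commit();
    }
}

Before calling replay(), you'd point a CloudSolrServer at the recovered cloud's ZooKeeper ensemble and pass in the timestamp of the last update that is known to be in the replicated index.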
I do import from a database, so the application-level tracking works really well for me.
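For what it's worth, the tracking itself is pretty simple when the source is a database: remember the last_modified value of the newest row you've indexed, and after the outage re-send every row that's newer. Again, just a rough sketch with placeholder table and column names, using the same 4.x CloudSolrServer class:

import java.sql.*;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.common.SolrInputDocument;

/**
 * Database-driven catch-up: re-index every row whose last_modified column is
 * newer than the checkpoint recorded before the outage.  The table and column
 * names (documents, id, title, last_modified) are placeholders.
 */
public class DatabaseCatchUp {
    public static void reindexSince(Connection db, CloudSolrServer recoveredCloud,
                                    Timestamp checkpoint)
            throws SQLException, SolrServerException, java.io.IOException {
        String sql = "SELECT id, title, last_modified FROM documents "
                   + "WHERE last_modified > ?";
        try (PreparedStatement stmt = db.prepareStatement(sql)) {
            stmt.setTimestamp(1, checkpoint);
            try (ResultSet rs = stmt.executeQuery()) {
                while (rs.next()) {
                    SolrInputDocument doc = new SolrInputDocument();
                    doc.addField("id", rs.getString("id"));
                    doc.addField("title", rs.getString("title"));
                    doc.addField("last_modified", rs.getTimestamp("last_modified"));
                    recoveredCloud.add(doc);
                }
            }
        }
        recoveredCloud.commit();
    }
}

The checkpoint can live anywhere durable, such as a small file or a row in the database itself.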
Thanks,
Shawn