On 8/28/2013 6:13 AM, Daniel Collins wrote:
> We have 2 separate data centers in our organisation, and in order to
> maintain the ZK quorum during any DC outage, we have 2 separate Solr
> clouds, one in each DC with separate ZK ensembles, but both are fed with
> the same indexing data.
>
> Now in the event of a DC outage, all our Solr instances go down, and when
> they come back up, we need some way to recover the "lost" data.
>
> Our thought was to replicate from the working DC, but is there a way to do
> that whilst still maintaining an "online" presence for indexing purposes?
One approach that would work (provided the core name structures are identical between the two clouds) is to shut down your indexing process, shut down the cloud that went down and has now come back up, and rsync the indexes over from the good cloud. Depending on the index size, that could take a long time, and index updates would be turned off the whole time it's happening, which makes the idea less than ideal.

I have a similar setup on a sharded index that's NOT using SolrCloud, with both copies in one location rather than two separate data centers, but my general indexing method would work for your setup too.

The way I handle this: my indexing program tracks its update position for each copy of the index independently. If one copy is down, the tracked position for that copy doesn't advance, so the next time the copy comes up, all the missed updates get applied to it. In the meantime, the program (Java, using SolrJ) happily uses a separate thread to keep updating the copy that's still up.

Thanks,
Shawn
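P.S. A minimal sketch of the per-copy position tracking, in case it helps. This is not Shawn's actual program: the class and method names (`CatchUpIndexer`, `updateCopy`, the `dcA`/`dcB` copy names) are invented for illustration, and the "send to Solr" step is stubbed out where a real program would make SolrJ client calls. The point it shows is that a down copy's position simply stops advancing, so catch-up is automatic when the copy returns.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: each index copy keeps its own tracked position
// (highest source-record id successfully sent to that copy).
class CatchUpIndexer {
    private final Map<String, Long> positions = new ConcurrentHashMap<>();

    CatchUpIndexer(List<String> copyNames) {
        for (String name : copyNames) positions.put(name, 0L);
    }

    /** Push all source records newer than this copy's tracked position.
     *  In a real program, "sending" a record would be a SolrJ update call;
     *  here we only advance the position to demonstrate the bookkeeping. */
    void updateCopy(String copy, List<Long> sourceIds, boolean copyIsUp) {
        if (!copyIsUp) return;              // copy down: position stays put
        long pos = positions.get(copy);
        for (long id : sourceIds) {
            if (id > pos) pos = id;         // record "sent", position advances
        }
        positions.put(copy, pos);
    }

    long position(String copy) { return positions.get(copy); }
}

public class Demo {
    public static void main(String[] args) throws InterruptedException {
        CatchUpIndexer idx = new CatchUpIndexer(List.of("dcA", "dcB"));
        List<Long> batch1 = List.of(1L, 2L, 3L);

        // One thread per copy, as in the SolrJ program; dcB is "down".
        Thread a = new Thread(() -> idx.updateCopy("dcA", batch1, true));
        Thread b = new Thread(() -> idx.updateCopy("dcB", batch1, false));
        a.start(); b.start(); a.join(); b.join();
        System.out.println("dcA=" + idx.position("dcA")
                + " dcB=" + idx.position("dcB"));   // dcA=3 dcB=0

        // dcB comes back up: its position is still 0, so everything it
        // missed (plus newer records) gets replayed to it automatically.
        idx.updateCopy("dcB", List.of(1L, 2L, 3L, 4L), true);
        idx.updateCopy("dcA", List.of(4L), true);
        System.out.println("after catch-up: dcA=" + idx.position("dcA")
                + " dcB=" + idx.position("dcB"));   // dcA=4 dcB=4
    }
}
```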