Toby Lazar wrote:
> Unless Solr is your system of record, aren't you already replicating your
> source data across the WAN? If so, could you load Solr in colo B from
> your colo B data source? You may be duplicating some indexing work, but
> at least your colo B Solr would be more closely in sync with your colo B
> data.
Our system of record exists in a SQL DB that is indeed replicated via always-on mirroring to the failover data center. However, a complete forced re-index of all of the data could take hours, and our SLA requires us to be back up with searchable indices in minutes. Because we may have to replicate multiple data centers' data (three-plus data centers: A, B, and the failover DC) into this failover data center, we can't dedicate the failover data center's SolrCloud to constantly re-indexing from a single SQL mirror when we could potentially need it to take over for any given one.

One thought we had was to have DCs A and B run a cron job that forces a backup of the indices using the "replication?command=backup" API command, then sync those backup snapshots over to the failover DC's shut-down SolrCloud instance, into a separate filesystem directory dedicated to DC A's or DC B's indices. In the case of a failover, we would then run a script that symlinks the snapshots for the particular DC we are failing over for into the index dir of the failover DC's SolrCloud and starts up the nodes.

The problem comes in how to handle different indices living on different nodes in the SolrCloud, since we have two shards. We would have to do a 1:1 copy from each of the four nodes in DCs A and B to the corresponding node in the failover DC. Sounds pretty ugly. Looking at this thread, even this plan may not work:

http://lucene.472066.n3.nabble.com/solrcloud-shards-backup-restoration-td4088447.html

As far as the SolrEntityProcessor goes, I'm not sure how you would configure it. From what I gather, you have to configure a new requestHandler section in your solrconfig.xml like this:

<requestHandler name="/dataimport"
    class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">/data/solr/mysolr/conf/data-config.xml</str>
  </lst>
</requestHandler>

And then you have to configure "/data/solr/mysolr/conf/data-config.xml" with the following contents:

<dataConfig>
  <document>
    <entity name="sep"
            processor="SolrEntityProcessor"
            url="http://solrsource.example.com:8983/solr/"
            query="*:*"/>
  </document>
</dataConfig>

However, this doesn't seem to work for me, as I'm using SolrCloud with ZooKeeper. I created these files in my conf directory and uploaded them to ZooKeeper, then reloaded the collection/cores, but all I got were initialization errors. I don't think the docs assume you'll be doing this under a SolrCloud scenario. Any other insight?
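
To spell out the cron half of the plan above a bit more, this is roughly what we were picturing on each node in DC A. Every hostname, core name, and path below is a made-up placeholder, and none of it is tested:

#!/bin/bash
# Per-node backup cron job (sketch; placeholder hosts and paths).
# Asks the local Solr node to snapshot its index via the replication
# handler, waits for the snapshot to land, then ships it to the
# failover DC's directory reserved for this DC's indices.

SOLR_HOST="solr-a1.example.com"       # placeholder DC A node
CORE="collection1_shard1_replica1"    # placeholder core name
DATA_DIR="/data/solr/${CORE}/data"    # placeholder dataDir
FAILOVER="failover.example.com"       # placeholder failover host

# Force a backup snapshot of the current index.
curl -s "http://${SOLR_HOST}:8983/solr/${CORE}/replication?command=backup"

# Crude wait for the snapshot.* directory to finish writing.
sleep 60

# rsync the snapshot(s) into the per-DC area on the failover side.
rsync -av --delete "${DATA_DIR}"/snapshot.* \
      "${FAILOVER}:/data/solr-backups/dcA/${CORE}/"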
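
The failover side would then be something along these lines, run on each failover node before starting it (again, everything here is illustrative only):

#!/bin/bash
# Failover script (sketch): symlink the synced snapshots for the DC
# we are taking over for into this node's index dir, then start Solr.
# Usage: ./failover.sh dcA   (or dcB)

DC="$1"                                        # which DC to stand in for
CORE="collection1_shard1_replica1"             # placeholder core name
BACKUP_DIR="/data/solr-backups/${DC}/${CORE}"  # where the rsyncs land
INDEX_DIR="/data/solr/${CORE}/data/index"      # live index dir

# Pick the newest snapshot for this core (names sort by timestamp).
SNAPSHOT=$(ls -d "${BACKUP_DIR}"/snapshot.* | sort | tail -n 1)

# Swap the index dir for a symlink to that snapshot.
rm -rf "${INDEX_DIR}"
ln -s "${SNAPSHOT}" "${INDEX_DIR}"

# Bring the previously shut-down node up.
service solr start    # placeholder for however the nodes get started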
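
For completeness: if the /dataimport handler ever did initialize under SolrCloud, I believe kicking off and then watching the copy from the source cloud would just be the standard DIH commands (host and collection names are made up here):

curl "http://failover.example.com:8983/solr/collection1/dataimport?command=full-import"
curl "http://failover.example.com:8983/solr/collection1/dataimport?command=status"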