First, perhaps the slickest way to reindex with less downtime would be to just index to a _new_ collection, then use "collection aliasing" to point incoming requests for the old collection at the new one. True, you do need extra hardware....
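As a rough sketch of the aliasing step, the snippet below only builds the Collections API CREATEALIAS request URL. The host, alias name, and collection name are made-up placeholders, not values from this thread; send the resulting URL with whatever HTTP client you normally use.

```python
from urllib.parse import urlencode


def create_alias_url(solr_base, alias, collection):
    """Build a Collections API CREATEALIAS URL.

    solr_base, alias and collection are hypothetical values chosen for
    illustration; substitute the ones from your own cluster.
    """
    params = {
        "action": "CREATEALIAS",
        "name": alias,              # the name clients keep querying
        "collections": collection,  # the freshly indexed collection
    }
    return f"{solr_base.rstrip('/')}/admin/collections?{urlencode(params)}"


if __name__ == "__main__":
    # Placeholder host and names, for illustration only.
    print(create_alias_url("http://localhost:8983/solr", "products", "products_v2"))
```

Clients that already query the alias name should not need to change once the alias points at the new collection.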
But that aside, Solr (well, Lucene really) indexes are just files. There's a collection-wide backup/restore, but check the PDF for your Solr version to see if it's available to you. Beyond that, just copy things around. So here's a process, modify as you see fit:

1> Index to your new collection in region 1.

2> In region 2, create a new collection with the same number of shards (no followers, leader-only).

3> With the Solr instances in region 2 down, copy the data dir from your servers in region 1 to the corresponding data dir on your servers in region 2. It is _very_ important that the hash ranges match. If you look at your state.json you'll see an entry for each shard like "range":"80000000-ffffffff". The hash range on the source must match exactly the hash range on the destination in region 2. Double check this, as you basically copy from collection_shard1_replica1...data (on region 1) to collection_shard1_replica1...data on region 2.

4> Once this is done for all shards, bring up Solr on region 2 and verify it's as you expect.

5> Use the Collections API to ADDREPLICA in region 2 to build out your collection. The ADDREPLICA will automatically copy the index from the leader.

Best,
Erick

On Fri, Feb 10, 2017 at 10:12 AM, Kelly, Frank <frank.ke...@here.com> wrote:
> Hello,
>
> We have 100M+ documents across 2 collections and need to reindex the
> entirety of the collections, as we need to turn on “docValues”:true on a
> number of fields (see previous emails from this week :-] ).
> Unfortunately we have 4 AWS regions, each with its own SolrCloud cluster,
> each with its own copy of the entire search index.
> So we have to do this reindex 4 times, and in each case we have to take
> down each region as we need to delete the collection. And reindexing takes
> about 2-3 days.
>
> Is there some way we can reindex in one (offline) region and then use some
> mechanism - replication? Backup/restore? EBS snapshot?
> to “copy and paste” a known Solr state from one SolrCloud instance to
> another. From that state we’d then just reindex the delta (from when the
> snapshot was taken to now).
>
> Appreciate any thoughts or ideas or hear how other folks do it,
>
> Thanks!
>
> -Frank
>
> *Frank Kelly*
> *Principal Software Engineer*
>
> HERE
> 5 Wayside Rd, Burlington, MA 01803, USA
> *42° 29' 7" N 71° 11' 32" W*
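The hash-range check from step 3 of the reply could be scripted along these lines. This is a minimal sketch, assuming each cluster's state.json has the layout collection -> "shards" -> shard name -> "range"; the collection name "c" and the sample ranges are invented for illustration, so check them against your own state.json before relying on it.

```python
import json


def shard_ranges(state_json_text, collection):
    """Return {shard_name: hash_range} for one collection.

    Assumes the state.json layout described above; this is an assumption,
    not something confirmed in the thread.
    """
    state = json.loads(state_json_text)
    shards = state[collection]["shards"]
    return {name: shard["range"] for name, shard in shards.items()}


def mismatched_shards(src_ranges, dst_ranges):
    """List shard names whose range differs or that exist on one side only."""
    names = sorted(set(src_ranges) | set(dst_ranges))
    return [n for n in names if src_ranges.get(n) != dst_ranges.get(n)]


if __name__ == "__main__":
    # Invented sample documents standing in for region 1 and region 2.
    region1 = '{"c": {"shards": {"shard1": {"range": "80000000-ffffffff"}, "shard2": {"range": "0-7fffffff"}}}}'
    region2 = '{"c": {"shards": {"shard1": {"range": "80000000-ffffffff"}, "shard2": {"range": "0-7fffffff"}}}}'
    bad = mismatched_shards(shard_ranges(region1, "c"), shard_ranges(region2, "c"))
    print("mismatched shards:", bad)
```

Only copy a shard's data dir when its name is absent from the mismatch list; an empty list means every source shard lines up with a destination shard of the same range.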