On 10/14/2012 11:16 AM, Erick Erickson wrote:
No, that's not what I'm thinking at all. There would be _no_
replication configured. You'd just have two completely independent
installations, one in each of your separate locations. The only
communication path would be that somehow the original documents
would need to get to both locations for indexing.

When I was on 1.4.1, I had replication set up. Replicating between 3.2.0 and 1.4.1 was not possible because of the javabin update, so I changed over to this exact model, even though the servers are right next to each other in the racks. Now I am on 3.5.0 and testing an update to branch_4x.

At this time I have no plans to change my distributed setup to SolrCloud. One day I might go to two separate single-stranded SolrCloud setups in order to simplify my indexing code. Our query volume is not high enough to require more than one online server chain; the only reason I have two chains is high availability.

You might wonder why I would not take advantage of SolrCloud's automated replication. I have simply found too much value in having two independently updated copies of my distributed index. I wrote my indexing code such that it can actually update/reindex any arbitrary number of completely independent index chains.
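
A minimal sketch of the idea of feeding any number of independent chains, not the actual indexing code described above. The chain names, URLs, and the `send` callable are hypothetical; in practice `send` would wrap whatever client posts documents to a chain's update handler.

```python
from typing import Callable, Dict, Iterable, List

Doc = Dict[str, str]
# Hypothetical transport: takes a chain's update URL and a batch of documents.
Sender = Callable[[str, List[Doc]], None]


def index_to_chains(chains: Dict[str, str],
                    docs: Iterable[Doc],
                    send: Sender) -> Dict[str, str]:
    """Send the same batch to every chain, each one independently.

    A failure on one chain is recorded and does not stop the others,
    so the chains never depend on each other for updates.
    """
    batch = list(docs)  # materialize once so every chain sees identical input
    results: Dict[str, str] = {}
    for name, url in chains.items():
        try:
            send(url, batch)
            results[name] = "ok"
        except Exception as exc:  # keep going; report the failure per chain
            results[name] = f"failed: {exc}"
    return results
```

Because each chain gets its own call and its own status, one chain can be reindexed or left behind on purpose without touching the other.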

When we want to make changes to our config/schema, I have a dev server where I can do almost all of the testing required, but that server is not big enough to hold the entire index. Because I have independent production indexes, I can make the proposed changes on the B chain, reindex, and test against the full index in a staging environment without affecting the actual production site. When it is time to roll changes into production, Solr's built-in enable/disable lets me switch the load balancer back and forth between the two indexes with one click. If it works, I can then update chain A the same way.
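
As a rough illustration of the switch, the sketch below only builds the URLs one might call. It assumes a Solr 4.x ping handler configured with a healthcheck file, which accepts `action=enable`, `action=disable`, and `action=status`; a 3.x install may expose enable/disable differently. The host and core names are made up, and the enable-before-disable order is a choice made for this example so the load balancer always has one healthy chain.

```python
from typing import List
from urllib.parse import urlencode

VALID_ACTIONS = ("enable", "disable", "status")


def ping_action_url(core_url: str, action: str) -> str:
    """Build a ping-handler URL for one core (assumes a healthcheck file is configured)."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    return f"{core_url.rstrip('/')}/admin/ping?{urlencode({'action': action})}"


def switch_plan(live_core_url: str, standby_core_url: str) -> List[str]:
    """Return the URLs to call, in order, to move traffic to the standby chain."""
    return [
        ping_action_url(standby_core_url, "enable"),   # bring the new chain up first
        ping_action_url(live_core_url, "disable"),     # then take the old chain out
    ]


if __name__ == "__main__":
    # Hypothetical hosts for chain A (live) and chain B (standby).
    for url in switch_plan("http://idxa1:8983/solr/live",
                           "http://idxb1:8983/solr/live"):
        print(url)
```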

Thanks,
Shawn
