Hello,

We have just moved from Solr 4.6 master/slave to 6.4.2 SolrCloud.  We have 
three collections, each with a single shard and a varying number of replicas, 
all coordinated by an ensemble of three ZooKeeper instances (on their own 
hosts).  As an ecommerce site, our capacity needs vary, so we add and remove 
replicas with some frequency.  The basic topology looks like this:

solr1
  |- collection1
        |- shard1 - replica1
  |- collection2
        |- shard1 - replica1
  |- collection3
        |- shard1 - replica1
         .
         .
         .
solrN
  |- collection1
        |- shard1 - replicaN
  |- collection2
        |- shard1 - replicaN
  |- collection3
        |- shard1 - replicaN

Where N varies between three and six most of the time.
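
For reference, this is roughly how we add and remove replicas, as a minimal 
SolrJ sketch against the Collections API (the ZooKeeper host string and the 
core_node name below are placeholders, not our real values):

    import org.apache.solr.client.solrj.impl.CloudSolrClient;
    import org.apache.solr.client.solrj.request.CollectionAdminRequest;

    public class ReplicaAdmin {
        public static void main(String[] args) throws Exception {
            // Placeholder ensemble string; ours lists three ZooKeeper hosts.
            String zkHost = "zk1:2181,zk2:2181,zk3:2181";
            try (CloudSolrClient client = new CloudSolrClient(zkHost)) {
                // Add a replica of shard1; SolrCloud chooses the node unless
                // one is pinned with setNode(...).
                CollectionAdminRequest.addReplicaToShard("collection1", "shard1")
                        .process(client);
                // Remove a replica by its core-node name (placeholder here).
                CollectionAdminRequest.deleteReplica("collection1", "shard1",
                        "core_node4").process(client);
            }
        }
    }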

During a recent test, we ran our indexing processes against a set of nodes, 
and then removed two nodes from our configuration.  Subsequently we reindexed 
the remaining nodes without problems.  The two nodes that had been removed 
(by simply stopping Solr on those boxes) were brought back into the cluster 
by starting Solr with the appropriate zkHost strings.  (These were the same 
zkHosts as when the instances were stopped.)  We found that the indexes did 
not sync up until we re-indexed the entire cluster.
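
In case it helps, this is roughly how we inspected replica state after 
bringing the nodes back, via SolrJ's view of the ZooKeeper cluster state 
(again a sketch; the ensemble string is a placeholder):

    import org.apache.solr.client.solrj.impl.CloudSolrClient;
    import org.apache.solr.common.cloud.ClusterState;
    import org.apache.solr.common.cloud.Replica;
    import org.apache.solr.common.cloud.Slice;

    public class ReplicaStateCheck {
        public static void main(String[] args) throws Exception {
            try (CloudSolrClient client =
                    new CloudSolrClient("zk1:2181,zk2:2181,zk3:2181")) {
                client.connect();  // required before reading cluster state
                ClusterState state = client.getZkStateReader().getClusterState();
                // Each of our collections has a single shard (see topology above).
                Slice shard = state.getCollection("collection1").getSlice("shard1");
                for (Replica replica : shard.getReplicas()) {
                    // A restarted replica should pass through RECOVERING to ACTIVE.
                    System.out.println(replica.getNodeName() + " -> "
                            + replica.getState());
                }
            }
        }
    }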

What are we missing?  We need the re-added indexes to synchronize with those 
already active in the cluster.  If we have to re-index the whole cluster, we 
risk inconsistent results being served from the new nodes while indexing is 
going on.  In reviewing the Reference Guide and doing various searches, I 
haven't found anything that clearly addresses adding replicas to a cluster 
when the cores already contain data.

Thank you for any insights,
Joe

Joe Heasly, Systems Analyst I
L.L.Bean, Inc. ~ Direct Channel Business & Technology Team
Office: 207.552.2254
Cell:    207.756.9250
