Hi,

I currently have a standalone Solr 4.5.1 deployment on an EC2 instance with a single collection and core containing an index that's roughly 10G. I've used it as a proof of concept, prototype, and staging environment during development, and I'm about to release to production.
For this release, I've set up 4 EC2 instances, with 3 of them forming a ZooKeeper ensemble and all 4 running SolrCloud. My intention is to have my current collection on a single shard, replicated 4 times, to meet the high availability requirements, and I'm using an ELB as a load balancer to spread query load across all 4 instances. To seed the index, I rsync'ed my current 10G collection to all 4 Solr instances in the SolrCloud and started them all up. They all come up, go through leader election and so on, and all are queryable, which is great. The idea is to load the current index as-is and then start updating it, rather than reindexing everything from scratch. But...

1) Using zkCLI, I can see that clusterstate.json shows all instances as down, and the Solr Admin interface reflects this by showing all 4 instances in the "down" color. Is that normal? How can I change it? And how can they be marked down when all 4 instances answer queries just fine?

2) The instances don't seem to be replicating: if I add a document to the collection, it doesn't get replicated to the other instances. Why is that? What should I look for in the Solr logs to confirm that replication is happening? I clearly see the "/admin/ping" requests made by the load balancer's health checks, and requests made to the admin interface, but I can never find requests to "/replicate" that would trigger the replication handler.

There's obviously something I've done wrong, but I can't put my finger on it. I would appreciate your insight on my situation.

Thanks,

Marc Campeau
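
P.S. In case it helps, here's roughly the kind of check I've been running, as a minimal SolrJ sketch (assuming SolrJ 4.x; the ZooKeeper host string, collection name, and test document id below are placeholders, not my actual config). It prints the replica states that ZooKeeper reports for the collection, then adds a test document through the cloud client to see whether it becomes visible everywhere.

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.CloudSolrServer;
    import org.apache.solr.client.solrj.response.QueryResponse;
    import org.apache.solr.common.SolrInputDocument;
    import org.apache.solr.common.cloud.ClusterState;
    import org.apache.solr.common.cloud.Replica;
    import org.apache.solr.common.cloud.Slice;

    public class CloudCheck {
        public static void main(String[] args) throws Exception {
            // Placeholders: replace with the real ZooKeeper ensemble and collection name.
            String zkHost = "zk1:2181,zk2:2181,zk3:2181";
            String collection = "collection1";

            CloudSolrServer server = new CloudSolrServer(zkHost);
            server.setDefaultCollection(collection);
            server.connect();

            // 1) Print the state ZooKeeper holds for each replica ("active", "down", ...),
            //    i.e. the same information zkCLI shows in clusterstate.json.
            ClusterState state = server.getZkStateReader().getClusterState();
            for (Slice slice : state.getSlices(collection)) {
                for (Replica replica : slice.getReplicas()) {
                    System.out.println(slice.getName() + " / " + replica.getName()
                            + " on " + replica.getNodeName()
                            + " -> state=" + replica.getStr("state"));
                }
            }

            // 2) Add a test document through the cloud client and commit it,
            //    then query it back. The id is a placeholder.
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "replication-test-1");
            server.add(doc);
            server.commit();

            QueryResponse rsp = server.query(new SolrQuery("id:replication-test-1"));
            System.out.println("numFound=" + rsp.getResults().getNumFound());

            server.shutdown();
        }
    }

My expectation is that with replication working, the test document would be searchable on every node right after the commit, which is not what I'm seeing.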