After changing the ZooKeeper timeout from 10 sec to 45/50 sec and monitoring for a long time, I can see that the servers went into recovery multiple times, but the exceptions are somewhat different:
INFO - 2015-04-22 09:02:47.943; org.apache.solr.common.cloud.ConnectionManager; Watcher org.apache.solr.common.cloud.ConnectionManager@6ad2b64e name:ZooKeeperConnection Watcher:bot1:2181,bot2:2181,bot3:2181,bot4:2181,bot5:2181 got event WatchedEvent state:SyncConnected type:None path:null path:null type:None
INFO - 2015-04-22 09:02:47.944; org.apache.solr.common.cloud.ConnectionManager; Client is connected to ZooKeeper
INFO - 2015-04-22 09:02:47.944; org.apache.solr.common.cloud.ConnectionManager$1; Connection with ZooKeeper reestablished.
WARN - 2015-04-22 09:02:47.944; org.apache.solr.cloud.RecoveryStrategy; Stopping recovery for zkNodeName=searcher1.abc:8980_solr_dict_en_shard1_replica4core=dict_en_shard1_replica4
WARN - 2015-04-22 09:02:47.944; org.apache.solr.cloud.RecoveryStrategy; Stopping recovery for zkNodeName=searcher1.abc:8980_solr_dict_cn_shard1_replica2core=dict_cn_shard1_replica2
WARN - 2015-04-22 09:02:47.944; org.apache.solr.cloud.RecoveryStrategy; Stopping recovery for zkNodeName=searcher1.abc:8980_solr_dict_hk_shard1_replica4core=dict_hk_shard1_replica4
WARN - 2015-04-22 09:02:47.944; org.apache.solr.cloud.RecoveryStrategy; Stopping recovery for zkNodeName=searcher1.abc:8980_solr_dict_jp_shard1_replica3core=dict_jp_shard1_replica3
WARN - 2015-04-22 09:02:47.945; org.apache.solr.cloud.RecoveryStrategy; Stopping recovery for zkNodeName=searcher1.abc:8980_solr_dict_vn_shard1_replica3core=dict_vn_shard1_replica3
WARN - 2015-04-22 09:02:47.945; org.apache.solr.cloud.RecoveryStrategy; Stopping recovery for zkNodeName=searcher1.abc:8980_solr_dict_th_shard1_replica3core=dict_th_shard1_replica3
WARN - 2015-04-22 09:02:47.945; org.apache.solr.cloud.RecoveryStrategy; Stopping recovery for zkNodeName=searcher1.abc:8980_solr_dict_nl_shard1_replica2core=dict_nl_shard1_replica2
INFO - 2015-04-22 09:02:47.945; org.apache.solr.cloud.ZkController; publishing core=rn0 state=down
INFO - 2015-04-22 09:02:47.945; org.apache.solr.cloud.ZkController; numShards not found on descriptor - reading it from system property
INFO - 2015-04-22 09:02:47.951; org.apache.solr.client.solrj.impl.HttpClientUtil; Creating new http client, config:maxConnections=128&maxConnectionsPerHost=32&followRedirects=false
ERROR - 2015-04-22 09:02:48.010; org.apache.solr.common.SolrException; :org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /overseer/queue/qn-
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
    at org.apache.solr.common.cloud.SolrZkClient$5.execute(SolrZkClient.java:218)
    at org.apache.solr.common.cloud.SolrZkClient$5.execute(SolrZkClient.java:215)
    at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:65)
    at org.apache.solr.common.cloud.SolrZkClient.create(SolrZkClient.java:215)
    at org.apache.solr.cloud.DistributedQueue.createData(DistributedQueue.java:284)
    at org.apache.solr.cloud.DistributedQueue.offer(DistributedQueue.java:271)
    at org.apache.solr.cloud.ZkController.publish(ZkController.java:1011)
    at org.apache.solr.cloud.ZkController.publish(ZkController.java:976)
    at org.apache.solr.handler.admin.CoreAdminHandler$2.run(CoreAdminHandler.java:811)
INFO - 2015-04-22 09:02:48.010; org.apache.solr.update.DefaultSolrCoreState; Running recovery - first canceling any ongoing recovery
INFO - 2015-04-22 09:02:48.012; org.apache.solr.cloud.RecoveryStrategy; Starting recovery process. core=rn0 recoveringAfterStartup=false
INFO - 2015-04-22 09:02:48.016; org.apache.solr.cloud.ZkController; publishing core=rn0 state=recovering
INFO - 2015-04-22 09:02:48.017; org.apache.solr.cloud.ZkController; numShards not found on descriptor - reading it from system property
INFO - 2015-04-22 09:02:48.020; org.apache.solr.client.solrj.impl.HttpClientUtil; Creating new http client, config:maxConnections=128&maxConnectionsPerHost=32&followRedirects=false

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-went-on-recovery-multiple-time-tp4196249p4201508.html