Can you tell us which version of Solr you are using, and what is causing your replicas to go into recovery?
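If it helps to collect that, here is a rough sketch in Python that reads the Solr version and counts replica states per shard. It assumes your version exposes the Collections API CLUSTERSTATUS action (added in 4.8, as far as I recall) and the /admin/info/system handler. The base URL, the collection name, and the sample data are placeholders of mine, not taken from your setup. It only shows which replicas are recovering; the reason should be in the Solr logs on those nodes.

```python
import json
from collections import Counter
from urllib.request import urlopen


def fetch_json(url):
    """GET a URL and parse the response body as JSON."""
    with urlopen(url) as resp:
        return json.loads(resp.read().decode("utf-8"))


def solr_version(system_info):
    """Pull the Solr version out of a parsed /admin/info/system response."""
    return system_info["lucene"]["solr-spec-version"]


def replica_states(cluster_status, collection):
    """Count replica states per shard in a parsed CLUSTERSTATUS response."""
    shards = cluster_status["cluster"]["collections"][collection]["shards"]
    return {
        shard: Counter(rep["state"] for rep in info["replicas"].values())
        for shard, info in shards.items()
    }


def report(base_url, collection):
    """Print version and per-shard replica states. Needs a reachable Solr node."""
    info = fetch_json(base_url + "/admin/info/system?wt=json")
    status = fetch_json(base_url + "/admin/collections?action=CLUSTERSTATUS&wt=json")
    print("Solr version:", solr_version(info))
    for shard, counts in sorted(replica_states(status, collection).items()):
        print(shard, dict(counts))


# Hand-made sample responses (not real cluster output), trimmed to the
# fields the functions above read.
SAMPLE_INFO = {"lucene": {"solr-spec-version": "x.y.z"}}
SAMPLE_STATUS = {
    "cluster": {
        "collections": {
            "collection1": {
                "shards": {
                    "shard1": {
                        "replicas": {
                            "core_node1": {"state": "active", "leader": "true"},
                            "core_node2": {"state": "recovering"},
                        }
                    },
                    "shard2": {
                        "replicas": {
                            "core_node3": {"state": "active", "leader": "true"},
                            "core_node4": {"state": "active"},
                        }
                    },
                }
            }
        }
    }
}
```

Against a live node you would call something like report("http://localhost:8983/solr", "collection1"), with your own host and collection name. Running it every minute or so during indexing would show when each shard's replicas start to drop out.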

On Fri, Jan 23, 2015 at 8:40 PM, gouthsmsimhadri <gouthamsimha...@gmail.com> wrote:

> I'm working with a cluster of solr-cloud servers at a configration of 10
> shards and 4 replicas on each shard in stress environment.
> Planned production configuration is 10 shards and 15 replicas on each
> shard.
>
> Current commit settings are as follows
>
> <autoSoftCommit>
>   <maxDocs>500000</maxDocs>
>   <maxTime>180000</maxTime>
> </autoSoftCommit>
>
> <autoCommit>
>   <maxDocs>2000000</maxDocs>
>   <maxTime>180000</maxTime>
>   <openSearcher>false</openSearcher>
> </autoCommit>
>
> The application requires to index approximately 90 Million docs which is
> indexed in two ways
> a) Full indexing. It takes 4 hours to index 90 Million docs and the rate
>    of docs coming to the searcher is around 6000 per second
> b) Incremental indexing. It takes an hour to index delta changes. Roughly
>    there are 3 million changes and rate of docs coming to the searchers
>    is 2500 per second
>
> I use two collections for example collection1 and collection2
> Each collection has system settings at 12 GB of available RAM and quad core
> Intel(R) Xeon(R) CPU X5570 @ 2.93GHz
>
> Full indexing is always performed on a collection which is not serving live
> traffic and Once job is completed we swap collection so the collection with
> latest data serves traffic and other is inactive.
>
> The other mode of incremental indexing is performed always on the
> collection which is serving live traffic.
>
> The problem is in about 10 minutes of indexing is triggered, the replicas
> goes in to recovery mode. This happens on all the shards. In about 20
> minutes or more rest of replicas start to fall into recovery mode. In about
> half an hour all replicas except the leader is in recovery mode.
>
> I cannot throttle the indexing load as that will increase our overall
> indexing time. So to overcome this issue, I remove all the replicas before
> the indexing is started and then add them after the indexing completes.
>
> The behavior(replicas falling into recovery mode) in incremental mode of
> indexing is troublesome as i cannot remove replicas during incremental
> indexing since it serves live traffic, i tried to throttle the speed at
> which documents are indexed but with no success as the cluster still goes
> on recovery.
>
> If i let the cluster as is the indexing eventually completes and also
> recovers after a while, but since this is serving live traffic i just
> cannot let these replicas go into recovery mode since it degrades the
> search performance also (from the tests performed).
>
> I tried different commit settings like the below
> a) No auto soft commit, no auto hard commit and a commit triggered at the
>    end of indexing
> b) No auto soft commit, yes auto hard commit and a commit in the end of
>    indexing
> c) Yes auto soft commit , no auto hard commit
> d) Yes auto soft commit , yes auto hard commit
> e) Different frequency setting for commits for above
>
> Unfortunately all the above yields the same behavior . The replicas still
> goes in recovery
>
> I have increased the zookeeper timeout from 30 seconds to 5 minutes and the
> problem persists.
>
> Is there any setting that would fix this issue ?
>
> -----
> -goutham
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/SolrCloud-Replicas-fall-into-recovery-mode-right-after-update-tp4181706.html
> Sent from the Solr - User mailing list archive at Nabble.com.