On 1/26/2015 2:26 PM, Vijay Sekhri wrote:
> Hi Erick,
> In solr.xml file I had zk timeout set to
> <int name="zkClientTimeout">${zkClientTimeout:450000}</int>
> One thing that made it a bit better now is the zk tick time and
> syncLimit settings. I set it to a higher value as below. This may not
> be advisable though.
>
> tickTime=30000
> initLimit=30
> syncLimit=20
>
> Now we observed that replicas do not go in recovery as often as
> before. In the whole cluster at a given time I would have a couple of
> replicas in recovery whereas earlier there were multiple replicas from
> every shard.
> On the wiki https://wiki.apache.org/solr/SolrCloud it says "The
> maximum is 20 times the tickTime." in the FAQ so I decided to increase
> the tick time. Is this the correct approach?

The default zkClientTimeout on recent Solr versions is 30 seconds, up
from 15 in slightly older releases. Those values of 15 or 30 seconds
are a REALLY long time in computer terms, and if you are exceeding that
timeout on a regular basis, something is VERY wrong with your Solr
install. Rather than take steps to increase your timeout beyond the
normal maximum of 40 seconds (20 times a tickTime of 2 seconds), figure
out why you're exceeding that timeout and fix the performance problem.

The zkClientTimeout value that you have set, 450 seconds, is seven and
a half *MINUTES*. Nothing in Solr should ever take that long.

"Not enough memory in the server" is by far the most common culprit for
performance issues. Garbage collection pauses are a close second.

I don't actually know this next part for sure, because I've never
looked into the code, but I believe that increasing the tickTime,
especially to a value 15 times higher than default, might make all
zookeeper operations a lot slower.

Thanks,
Shawn
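
As a rough illustration of the timeout arithmetic discussed above, here
is a minimal Python sketch of how a ZooKeeper server bounds a requested
session timeout. It assumes minSessionTimeout and maxSessionTimeout are
left unset in zoo.cfg, so they fall back to 2 and 20 times tickTime. The
function names are hypothetical and are not part of Solr or ZooKeeper.

```python
# Sketch only: models the default ZooKeeper session timeout bounds.
# Assumption: minSessionTimeout / maxSessionTimeout are NOT set in zoo.cfg,
# so the server uses 2 * tickTime and 20 * tickTime respectively.

def session_timeout_bounds(tick_time_ms):
    """Return (min, max) session timeout in milliseconds for a tickTime."""
    return 2 * tick_time_ms, 20 * tick_time_ms


def negotiated_timeout(requested_ms, tick_time_ms):
    """Clamp the client's requested timeout into the server's allowed range."""
    lowest, highest = session_timeout_bounds(tick_time_ms)
    return max(lowest, min(requested_ms, highest))


if __name__ == "__main__":
    requested = 450000  # the zkClientTimeout quoted above, in ms
    for tick in (2000, 30000):  # ZooKeeper default vs. the quoted setting
        lowest, highest = session_timeout_bounds(tick)
        granted = negotiated_timeout(requested, tick)
        print(f"tickTime={tick} ms: allowed {lowest}-{highest} ms, "
              f"requested {requested} ms, granted {granted} ms")
```

Under these assumptions, a 450000 ms request against the default
tickTime of 2000 ms is limited to 40000 ms, while tickTime=30000 raises
the ceiling to 600000 ms and also raises the floor to 60000 ms.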