I have had a lot of problems with the stability of my cloud.

To improve the stability:

- Move zookeeper to another disk; the I/O from solr.home can kill your ensemble.

- Raise zkClientTimeout to 60s.

- Don't use a very big heap if you don't need it; try values around 4g and 
increase until the OOMs stop.

- Use the heap tuning recommendations from 
http://wiki.apache.org/solr/ShawnHeisey#GC_Tuning; they fixed 99% of my 
problems with zookeeper.

- Log GC times. I discovered pauses of 32s on my boxes, a total killer for 
zookeeper; the result was tons of expired sessions.
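
To find the long pauses in a GC log, something like the sketch below may help. It assumes a HotSpot JVM started with -XX:+PrintGCApplicationStoppedTime and -Xloggc:<file>, which writes "Total time for which application threads were stopped" lines. The function name long_pauses and the 5s default threshold are only illustrative; pick a threshold well below your zkClientTimeout.

```python
import re

# Line format written by HotSpot with -XX:+PrintGCApplicationStoppedTime
# (assumption: your JVM and flags produce this format).
STOPPED = re.compile(
    r"Total time for which application threads were stopped: "
    r"([0-9]+[.,][0-9]+) seconds"
)


def long_pauses(lines, threshold_s=5.0):
    """Return (line_number, pause_seconds) for every pause >= threshold_s."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = STOPPED.search(line)
        if match is None:
            continue  # not a stopped-time line
        # Some locales print a decimal comma instead of a point.
        pause = float(match.group(1).replace(",", "."))
        if pause >= threshold_s:
            hits.append((number, pause))
    return hits
```

Call it with an open log file, for example long_pauses(open("gc.log")). Any pause close to or above the zookeeper session timeout is a candidate cause for expired sessions.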


-- 
Yago Riveiro


On Thursday, December 19, 2013 at 5:45 PM, Shawn Heisey wrote:

> On 12/19/2013 3:44 AM, ilay raja wrote:
> > I have deployed solr cloud with external zookeeper ensemble (5
> > instances). I am running solr instances on two servers with single shard
> > index. There are 6 replicas. I often see solr going down during high search
> > load (or) whenever i run indexing documents. I tried tuning hardcommit
> > (kept as 15 mins) and softcommits(12 mins). Also, set zkClientTimeout as 30
> > secs. I sometimes observed OOM, Socket exceptions, EOF exceptions in solr
> > logs while the instance is going down. Also, zookeeper recovery for the
> > solr instance is going in loop .... My use case is sort of high search (100
> > queries per sec) / heavy indexing (10 K docs per minute). What is the best
> > way to keep stable solr cloud instances with external ensemble. Should we
> > try running zookeeper internally, because looks like zookeeper handshaking
> > might be an issue as well. Is solr cloud stable for production ? or there
> > are open issues still. Please guide me.
> > 
> 
> 
> You definitely do not want to run zookeeper embedded in Solr. The
> simple reason is that if you stop Solr, you also stop
> zookeeper. Zookeeper works best if it remains up all the time, so an
> external ensemble is highly recommended.
> 
> It's probably a good idea to set the max heap on the zookeeper startup
> ... one of my zk java instances is using 65MB resident memory, so unless
> it's a very large cloud, a low number like 128MB would probably be enough.
> 
> I've heard that heavy I/O on the disk with the zookeeper data can cause
> problems for zookeeper. This is the one danger that can come from
> putting both Solr and an external zookeeper on the same host, which is
> usually a very safe thing to do. Unless you've got very fast I/O, it's
> recommended that the zookeeper data is put on separate disk spindles
> from anything else. When Solr has performance problems, it's usually
> from heavy I/O, and if heavy I/O is causing problems with zookeeper,
> then the problem just compounds itself.
> 
> You haven't indicated how big the java heap for Solr is. Severe
> stability problems can result from GC pauses, so it's extremely
> important to tune your garbage collection unless your Solr max heap is
> very very small (less than 1GB). Here's my personal wiki page with
> settings that work for me, they seem to work for others too:
> 
> http://wiki.apache.org/solr/ShawnHeisey#GC_Tuning
> 
> Severe GC pause problems can also result from the Solr java heap being
> too small. Here's a more involved wiki page on performance issues that
> I have seen:
> 
> http://wiki.apache.org/solr/SolrPerformanceProblems
> 
> Thanks,
> Shawn
> 
> 

