Re: Investigating performance issues in solr cloud

Shawn Heisey Tue, 08 Apr 2014 18:03:07 -0700

On 4/8/2014 6:48 PM, Utkarsh Sengar wrote:
> 1. I am using Oracle JVM
> user@host:~$ java -version
> java version "1.6.0_45"
> Java(TM) SE Runtime Environment (build 1.6.0_45-b06)
> Java HotSpot(TM) 64-Bit Server VM (build 20.45-b01, mixed mode)


That version should be very good until you need to upgrade to Solr 4.8,
which requires Java 7.  The problems in the latest Java 7 releases are
supposed to be fixed in 7u60, due out in May 2014.  A pre-release of 7u60
is available now on https://jdk7.java.net if you want to try it in a dev
environment.

> 2. I will try out jHiccup and your GC settings.

The only GC-related graph I saw in what you provided from New Relic was
GC CPU time, and I don't think that particular metric would ever get
very high, even when a long stop-the-world pause is happening.  Many of
the common tools that people look to for GC information are not able to
detect long GC pauses.
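
If you want something besides jHiccup that shows the pauses directly, GC
logging is one option.  This is only a rough sketch, assuming the HotSpot
JVM you listed and the Jetty start.jar that ships with the Solr example;
the gc.log path is a placeholder, and you would add these to whatever
options you already pass:

```shell
# Assumed HotSpot logging flags for Java 6/7; gc.log is a placeholder path.
java -verbose:gc -Xloggc:gc.log \
     -XX:+PrintGCDetails -XX:+PrintGCDateStamps \
     -XX:+PrintGCApplicationStoppedTime \
     -jar start.jar

# Stop-the-world pause lengths are then reported in lines containing
# "Total time for which application threads were stopped".
grep "application threads were stopped" gc.log
```

Unlike a GC CPU time graph, the stopped-time lines report how long the
application threads were actually halted.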

> 3. Yes, I am running ZK instances in an ensemble. I didn't know I need to
> pass all the instances of ZK to a single solr node. I will try it out right
> now. This means if you have a large cluster say of 50 solr nodes and 10 ZK
> nodes then I will need to pass all the 10 nodes to -DzkHost of the 50 solr
> processes? What is the reasoning behind this?

I would not think that any more than 3 or 5 ZK nodes would ever be
needed.  If you really did have ten of them, you probably could just
tell Solr about some of them instead of all of them.  I think you'd want
to be sure that all Solr servers were pointed at the same set.

It's really just so that Solr knows who to contact if the ZK server it's
currently talking to goes down or is unreachable for a time that exceeds
zkClientTimeout.  GC pause problems can be severe enough to exceed the
default zkClientTimeout of 15 seconds.
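
As an illustration only, a startup line for one Solr node might look
something like the following.  The hostnames are made up, the 30 second
timeout is just an example value and not a recommendation, and this
assumes a solr.xml that reads the zkClientTimeout system property the way
the stock example config does:

```shell
# Hypothetical hostnames; give every Solr node the same ZK list.
java -DzkHost=zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181 \
     -DzkClientTimeout=30000 \
     -jar start.jar
```

Raising the timeout only hides long pauses.  Fixing the GC behavior
itself is the better long-term answer.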

Thanks,
Shawn
