On 4/8/2014 6:48 PM, Utkarsh Sengar wrote:
> 1. I am using Oracle JVM
> user@host:~$ java -version
> java version "1.6.0_45"
> Java(TM) SE Runtime Environment (build 1.6.0_45-b06)
> Java HotSpot(TM) 64-Bit Server VM (build 20.45-b01, mixed mode)

That version should be very good until you need to upgrade to Solr 4.8, which requires Java 7. The problems in the latest Java 7 releases are supposed to be fixed in 7u60, due out in May 2014. A pre-release of 7u60 is available now on https://jdk7.java.net if you want to try it in a dev environment.

> 2. I will try out jHiccup and your GC settings.

The only GC-related graph I saw in what you provided from New Relic was GC CPU time, and I don't think that particular metric would ever get very high, even when a long stop-the-world pause is happening. Many of the common tools that people look to for GC information are not able to detect long GC pauses.

> 3. Yes, I am running ZK instances in an ensemble. I didn't know I need to
> pass all the instances of ZK to a single solr node. I will try it out right
> now. This means if you have a large cluster say of 50 solr nodes and 10 ZK
> nodes then I will need to pass all the 10 nodes to -DzkHost of the 50 solr
> processes? What is the reasoning behind this?

I would not think that any more than 3 or 5 ZK nodes would ever be needed. If you really did have ten of them, you could probably just tell Solr about some of them instead of all of them, though I think you'd want to be sure that all Solr servers were pointed at the same set. Listing more than one ZK server is really just so that Solr knows who to contact if the ZK server it's currently talking to goes down or is unreachable for longer than zkClientTimeout. GC pause problems can be severe enough to exceed the default zkClientTimeout of 15 seconds.

Thanks,
Shawn
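
As a footnote to the point about tools that cannot see long pauses: below is a minimal Java sketch of one way to observe stalls from inside the JVM. It is roughly the idea behind tools such as jHiccup as I understand it, not their actual implementation. A thread sleeps for a short fixed interval and records how much longer than requested it really took to wake up. The class name PauseMeter, both method names, and the interval and iteration values are all made up for illustration. The 15000 ms threshold is the default zkClientTimeout mentioned above.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration only; not part of jHiccup or Solr.
class PauseMeter {

    // Default zkClientTimeout from the discussion above (15 seconds).
    static final long ZK_CLIENT_TIMEOUT_MS = 15000L;

    // Sleeps intervalMs at a time and returns the largest observed overshoot
    // in milliseconds. A stop-the-world GC pause freezes this thread too, so
    // a long pause shows up as a large overshoot.
    static long worstStallMillis(long intervalMs, int iterations) {
        long worst = 0L;
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and stop measuring.
                Thread.currentThread().interrupt();
                break;
            }
            long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
            worst = Math.max(worst, elapsedMs - intervalMs);
        }
        return worst;
    }

    // True if a stall of this length is longer than the ZK session timeout.
    static boolean wouldExceedTimeout(long stallMs) {
        return stallMs > ZK_CLIENT_TIMEOUT_MS;
    }

    public static void main(String[] args) {
        // Assumed values: 1 ms interval, 10000 samples (about 10 seconds).
        long worst = worstStallMillis(1L, 10000);
        System.out.println("worst stall (ms): " + worst);
        System.out.println("exceeds zkClientTimeout: " + wouldExceedTimeout(worst));
    }
}
```

The overshoot also includes ordinary OS scheduling delay, so small values are expected even on a healthy system. Only stalls of hundreds of milliseconds or more point to the kind of pause discussed here, and the measurement has to run inside the same JVM as Solr to say anything about Solr's pauses.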