Hi Shawn,

I've looked at the Zing JVM before but don't use it.  jHiccup looks like a 
really useful tool.  Can you tell us how you are starting it up?  Do you start 
it wrapping the app container (i.e. Tomcat / Jetty)?

Thanks
Robi

-----Original Message-----
From: Shawn Heisey [mailto:s...@elyograg.org] 
Sent: Tuesday, February 05, 2013 1:27 PM
To: solr-user@lucene.apache.org
Subject: Re: Really bad query performance for date range queries

On 2/5/2013 12:51 PM, sausarkar wrote:
> We have a 96GB RAM machine with 16 processors. The JVM is set to use 60 GB.
> The tests that we are running are purely queries; there is no indexing going on.
> I don't see garbage collection when I attach VisualVM but see frequent 
> CPU spikes ~once every minute.

A previous message from you indicates that your index is 12GB.  I agree with 
Erick that this is not very large.  The pauses that you have described sound a 
lot like stop-the-world garbage collection.  I've seen very long pauses on an 
8GB heap ... I don't even want to think about what could happen on 60GB.

Do you really need a 60GB heap?  My dev server handles seven index shards with 
a 7GB heap and 16GB total RAM.  On 4.1 the total index size is over 100GB.  
On 4.2-SNAPSHOT the total index size is about 83GB. 
Query performance isn't stellar, but it works perfectly.  My production servers 
(running 3.5) have tons of RAM and each one only gets half the index, but they 
only run with the heap at 8GB.  My queries are pretty low volume and not HUGELY 
complex.  Median query time is about 26 milliseconds and 95th percentile is 
about 950 milliseconds.

Looking at the GC stats in jconsole/jvisualvm, I didn't think I had a GC pause 
problem, but I was proven wrong when I started correlating all the various logs 
in my system to load balancer "DOWN" incidents.  I saw a pause of 12 seconds 
once in the GC log - on an 8GB heap.
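
As an illustration of why summary statistics can hide a problem like this, here 
is a minimal sketch that reads the cumulative GC counters the JVM exposes through 
the standard java.lang.management API.  The class name GcTotals is made up for 
this example.  These counters are totals since JVM start, so a single 12-second 
pause is easy to miss among many short collections; the GC log or a pause 
tracker is still needed to see individual pauses.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper class, not part of Solr or any GC tool.
final class GcTotals {

    // Builds one line per collector: name, number of collections, total time.
    static String summary() {
        StringBuilder sb = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both values are cumulative since JVM start (-1 if unsupported).
            sb.append(String.format("%s: %d collections, %d ms total%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(summary());
    }
}
```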

I was introduced to a very cool program that tracks any kind of pause that's 
caused by factors outside the Java program, like GC pauses in the JVM or 
something happening in the OS.  This is much easier to interpret than Java's GC 
logging, and you can get a nice graph from the data.

http://www.azulsystems.com/jHiccup
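
To illustrate the general idea behind this kind of pause tracking, here is a 
minimal sketch.  It is not jHiccup's actual code, and the class and method names 
are made up for this example.  A thread asks to sleep for 1 ms over and over and 
records how much longer than requested each sleep took.  Any large overshoot 
means the whole JVM (or the OS) stalled, whatever the cause.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of pause detection; not jHiccup itself.
final class HiccupSketch {

    // Sleeps in 1 ms steps for roughly durationMillis and returns the
    // worst observed overshoot (actual sleep minus requested), in nanoseconds.
    static long maxOvershootNanos(long durationMillis) throws InterruptedException {
        final long intervalNanos = TimeUnit.MILLISECONDS.toNanos(1);
        final long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(durationMillis);
        long worst = 0;
        while (System.nanoTime() - deadline < 0) {
            long start = System.nanoTime();
            TimeUnit.NANOSECONDS.sleep(intervalNanos);
            long overshoot = (System.nanoTime() - start) - intervalNanos;
            worst = Math.max(worst, overshoot);
        }
        return worst;
    }

    public static void main(String[] args) throws InterruptedException {
        long worst = maxOvershootNanos(5_000);
        System.out.printf("worst stall in 5 s: %.3f ms%n", worst / 1e6);
    }
}
```

On an idle machine the worst stall is typically small, limited mostly by timer 
granularity and scheduling.  A real tool records the full distribution of these 
values over time instead of just the maximum, which is what makes the graphs 
possible.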

Using jHiccup, I was able to do a little bit of comparison between different 
runs.  That helped me find some GC tuning parameters that have almost gotten 
rid of my GC pause problem.  I'm constantly working on those parameters.  The 
current values are:

-XX:+UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction=75
-XX:NewRatio=3
-XX:MaxTenuringThreshold=8
-XX:+CMSParallelRemarkEnabled
-XX:+ParallelRefProcEnabled
-XX:+UseLargePages
-XX:+AggressiveOpts
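
These options have to reach the JVM command line of whatever container runs 
Solr, and how that is done depends on the container's startup script.  One way 
to confirm that they actually took effect is to ask the running JVM for its 
input arguments through the standard RuntimeMXBean.  The sketch below assumes 
it runs inside the JVM being checked; the class name ShowJvmFlags is made up 
for this example.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper class for checking which JVM options are in effect.
final class ShowJvmFlags {

    // Returns the options passed to this JVM (for example -Xmx and -XX flags),
    // excluding the arguments passed to the main method.
    static List<String> jvmFlags() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    public static void main(String[] args) {
        for (String flag : jvmFlags()) {
            System.out.println(flag);
        }
    }
}
```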

The Zing JVM (made by the company that created jHiccup) apparently has 
extremely low GC pause characteristics even with giant heaps like yours.  
I'm not using it, and I don't know how much it costs.

Thanks,
Shawn


