On 4/28/2016 9:43 AM, Nick Vasilyev wrote:
> I forgot to mention that the index is approximately 50 million docs split
> across 4 shards (replication factor 2) on 2 solr replicas.

Later in the thread, Jeff Wartes mentioned my wiki page for GC tuning.

https://wiki.apache.org/solr/ShawnHeisey#GC_Tuning_for_Solr

Oracle seems to be putting most of their recent work into the G1
collector.  They want to make it the default collector for Java 9.  I
don't know if this is going to happen, or if they will delay that change
until Java 10 ... but I think that the default *is* going to eventually
be G1.

Lucene strongly recommends against EVER using G1, but I have not noticed
any problems with it on Solr.  I have not seen any specific *reason*
that G1 is not recommended, at least with a 64-bit JVM.  I can only
remember seeing issues in Jira with G1 when the JVM was 32-bit -- which
has its own limitations, so I don't recommend it anyway.  If somebody
can point me to specific *OPEN* issues showing current problems with G1
on a 64-bit JVM, I will have an easier time believing the Lucene
assertion that it's a bad idea.  I have searched Jira and can't find
anything relevant.

The best GC results I've seen in testing Solr have been with the G1
collector.  I haven't done any testing for quite a while, and almost all
of the testing that I've done has been with 4.x versions, not 5.x.

Out of the box, Solr 5.0 and later uses GC tuning with the CMS collector
that looks a lot like the CMS config that I came up with.  This works
pretty well, but if the heap size gets big enough, especially 32GB or
larger, I suspect that it will start to have problems with GC performance.

You could give the settings I listed under "Current experiments" a try. 
You would do this by editing your solr.in.* script to comment out the
current GC tuning parameters and substituting the new set of
parameters.  This is a G1 config, and as I already mentioned, Lucene
recommends NEVER using G1.
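
As a rough sketch of that edit in solr.in.sh (solr.in.cmd on Windows uses
"set GC_TUNE=..." instead), the change might look something like the
fragment below.  The flags shown are real HotSpot options, but the
selection and values are placeholders for illustration, not the list from
the wiki page; copy the actual set from "Current experiments".

```shell
# solr.in.sh -- illustrative fragment only; flag values are assumptions.

# Existing CMS tuning: comment it out rather than deleting it,
# so it is easy to switch back.
#GC_TUNE="(existing CMS flags)"

# G1 replacement.  Substitute the settings from the wiki page here.
GC_TUNE="-XX:+UseG1GC \
-XX:+ParallelRefProcEnabled \
-XX:MaxGCPauseMillis=250"
```

Solr has to be restarted before the new settings take effect.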

For your specific situation, I think you should try setting the max heap
to 31GB instead of 32GB.  Below 32GB, a 64-bit JVM can use compressed
object pointers, so the pointer sizes are cut in half, without making a
huge difference in the total amount of memory available.
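
One way to check what the JVM does with pointers at a given heap size is
to ask it directly.  This is a sketch that assumes a 64-bit HotSpot-based
JVM on the PATH and the SOLR_HEAP variable from solr.in.sh; the exact
cutoff can vary a little between JVM versions and settings.

```shell
# solr.in.sh: keep the max heap just under the 32GB threshold.
SOLR_HEAP="31g"

# From a shell, print the JVM's final flag values for each heap size
# and look at UseCompressedOops (compressed object pointers).
java -Xmx31g -XX:+PrintFlagsFinal -version 2>/dev/null | grep UseCompressedOops
java -Xmx32g -XX:+PrintFlagsFinal -version 2>/dev/null | grep UseCompressedOops
```

If the first command reports the flag as true and the second as false,
the 31GB heap is getting the smaller pointers and the 32GB heap is not.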

The chewiebug version of GCViewer is what I use to take the GC logfile
and see how Solr did with particular GC settings.  At my request, they
have incorporated some really nice statistical metrics into GCViewer,
but they are not available in the released versions -- you'll have to
compile it yourself until 1.35 comes out.

https://github.com/chewiebug/GCViewer/issues/139

I have also had some good luck with jHiccup.  That tool will inform you
of pauses that happen for *any* reason, not just because of GC.  My
experience is that GC is the only *major* cause of pauses.

Thanks,
Shawn
