On 10/14/2019 7:18 AM, Vassil Velichkov (Sensika) wrote:
After the migration from 6.x to 7.6 we kept the default GC for a couple of weeks, then we started experimenting with G1 and managed to achieve less frequent OOM crashes, but not by much.
Changing your GC settings will never prevent OOMs. The only way to prevent them is to either increase the resource that's running out or reconfigure the program to use less of that resource.
As I explained in my previous e-mail, the unused filterCache entries are not discarded, even after a new SolrSearcher is started. The Replicas are synced with the Masters every 5 minutes, the filterCache is auto-warmed, and the JVM heap utilization keeps going up. Within 1 to 2 hours a 64GB heap is exhausted. The GC log entries clearly show more and more humongous allocations piling up.
While it is true that the generation-specific collectors for G1 do not clean up humongous allocations from garbage, eventually Java will perform a full GC, which will be slow, but should clean them up. If a full GC is not cleaning them up, that's a different problem, one that I would suspect lies in your installation. We have had memory leak bugs in Solr, but I am not aware of any that are as serious as your observations suggest.
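To put a rough number on why filterCache entries end up humongous: G1 treats any single allocation of at least half a region as humongous, and a filterCache entry is typically a bitset of roughly maxDoc/8 bytes. Here is a back-of-the-envelope sketch; the document count and region size below are illustrative assumptions, not values from your installation:

    public class HumongousCheck {
        public static void main(String[] args) {
            // Illustrative assumptions -- substitute the real values for your cores.
            long maxDoc = 200_000_000L;           // documents in one core (assumed)
            long regionSize = 16L * 1024 * 1024;  // e.g. -XX:G1HeapRegionSize=16m

            // A filterCache entry is roughly one bit per document in the core.
            long filterEntryBytes = maxDoc / 8;

            // G1 treats any single allocation of >= half a region as humongous.
            boolean humongous = filterEntryBytes >= regionSize / 2;

            System.out.printf("filterCache entry ~%d MB, humongous: %b%n",
                    filterEntryBytes / (1024 * 1024), humongous);
        }
    }

Raising -XX:G1HeapRegionSize (its maximum is 32m) can move some entries out of the humongous path, but it does not reduce the total heap the cache needs.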
You could be running into a memory leak ... but I really doubt that it is directly related to the filterCache or the humongous allocations. Upgrading to the latest release that you can would be advisable -- the latest 7.x version would be my first choice, or you could go all the way to 8.2.0.
Are you running completely stock Solr, or have you added custom code? One of the most common problems with custom code is leaking searcher objects, which will cause Java to retain the large cache entries. We have seen problems where one Solr version will work perfectly with custom code, but when Solr is upgraded, the custom code has memory leaks.
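To illustrate what a leaked searcher looks like in custom code, here is a minimal sketch; the component name is hypothetical, but SolrCore.getSearcher() with a RefCounted wrapper is how Solr hands out searcher references:

    import org.apache.solr.core.SolrCore;
    import org.apache.solr.search.SolrIndexSearcher;
    import org.apache.solr.util.RefCounted;

    // Hypothetical custom component; the point is the reference handling.
    public class CustomSearchHelper {

        void inspectIndex(SolrCore core) {
            // getSearcher() increments a reference count on the current searcher.
            RefCounted<SolrIndexSearcher> ref = core.getSearcher();
            try {
                SolrIndexSearcher searcher = ref.get();
                System.out.println("maxDoc: " + searcher.getIndexReader().maxDoc());
            } finally {
                // The classic leak is skipping this decref(): a searcher that is
                // never released stays on the heap, along with every filterCache
                // entry it holds, even after a new searcher has been opened.
                ref.decref();
            }
        }
    }

A leak of that kind would match the symptom you describe, where old cache entries are never discarded after a new searcher opens.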
We have a really stressful use-case: a single user opens a live report with 20-30 widgets, each widget performs a Solr search or facet aggregation, sometimes with 5-15 complex filter queries attached to the main query, and the results are visualized as pivot charts. So one user can trigger hundreds of queries in a very short period of time, and when we have several analysts working on the same time period we usually end up with an OOM. This logic used to work quite well on Solr 6.x. The only other difference that comes to my mind is that with Solr 7.6 we've started using DocValues. I could not find documentation about DocValues memory consumption, so it might be related.
For cases where docValues are of major benefit, which is primarily facets and sorting, Solr will use less memory with docValues than it does with indexed terms. Adding docValues should not result in a dramatic increase in memory requirements, and in many cases, should actually require less memory.
Yep, but I plan to generate some detailed JVM trace-dumps, so we could analyze which class / data structure causes the OOM. Any recommendations about what tool to use for a detailed JVM dump?
Usually the stacktrace itself is not helpful in diagnosing OOMs -- because the place where the error is thrown can be ANY allocation, not necessarily the one that is the major resource hog.
What I'm interested in here is the message immediately after the OOME, not the stacktrace. I'll admit that is slightly odd, because for many problems I *am* interested in the stacktrace. OutOfMemoryError is one situation where the stacktrace is not very helpful, but the short message the error contains is useful. I only asked for the stacktrace because collecting it will usually mean that nothing else in the message has been modified.
Here are two separate examples of what I am looking for:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space

Caused by: java.lang.OutOfMemoryError: unable to create new native thread
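On your earlier question about tooling: for analyzing which class or data structure is eating the heap, what you want is a heap dump (an .hprof file) that you can open in something like Eclipse MAT, not a stacktrace. The usual route is to start the JVM with -XX:+HeapDumpOnOutOfMemoryError, but you can also trigger a dump programmatically; here is a minimal sketch assuming a HotSpot JVM (the output path is just an example):

    import com.sun.management.HotSpotDiagnosticMXBean;
    import java.lang.management.ManagementFactory;

    public class HeapDumper {
        public static void main(String[] args) throws Exception {
            // Get the HotSpot diagnostic MXBean from the platform MBean server.
            HotSpotDiagnosticMXBean bean = ManagementFactory.newPlatformMXBeanProxy(
                    ManagementFactory.getPlatformMBeanServer(),
                    "com.sun.management:type=HotSpotDiagnostic",
                    HotSpotDiagnosticMXBean.class);

            // Dump only live objects; the output path is just an example.
            bean.dumpHeap("/var/tmp/solr-heap.hprof", true);
        }
    }

However the dump is produced, opening the .hprof in Eclipse MAT and looking at the dominator tree is usually the fastest way to see what is holding the heap.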
Also, not sure if I could send attachments to the mailing list, but there must be a way to share logs...?
There are many websites that facilitate file sharing. One example, and the one that I use most frequently, is Dropbox. Sending attachments to the list rarely works.
Thanks, Shawn