On 8/16/2019 8:23 AM, Rohan Kasat wrote:
I have a SolrCloud setup of 3 Solr servers, version 7.5.
24GB of heap memory is allocated to each Solr server, and I have around
655GB of index data to be searched.
For the last 2-3 days, the Solr servers have been crashing. I can see
that the heap memory is almost full, but the CPU usage is just 1%.
I am attaching the GC logs from the 3 servers. Can you please help
analyze the logs and comment on how to improve things?
https://gist.github.com/rohankasat/cee8203c0c12983d9839b7a59047733b
These three GC logs do not indicate that all the heap is used.
The peak heap usage during these GC logs is 18.86GB, 19.42GB, and
18.91GB. That's quite a bit below the 24GB max.
There are some very long GC pauses recorded. Increasing the heap size
MIGHT help with that, or it might not.
The typical way that Solr appears to "crash" is when an OutOfMemoryError
exception is thrown, at which time a Solr instance that is running on an
OS like Linux will kill itself with a -9 signal. This scripting is not
present when starting on Windows.
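For reference, that kill-on-OOME behavior comes from a JVM option that the Solr start script adds on non-Windows platforms. A sketch of what it looks like on the Java command line (the install path here is illustrative; check your own bin/solr output or the running process with ps to see the exact value):

```shell
# Illustrative fragment of the Solr JVM command line on Linux.
# When an OutOfMemoryError is thrown, the JVM runs the named script,
# which sends SIGKILL (-9) to the Solr process so it dies immediately
# instead of limping along in a broken state. The Windows start script
# does not set this option.
-XX:OnOutOfMemoryError="/opt/solr/bin/oom_solr.sh 8983 /var/solr/logs"
```

The arguments (port and log directory) are filled in by the start script for your installation, so yours will differ.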
An OOME can be thrown for a resource other than memory, so despite the
exception name, it might not actually be memory that has been depleted.
The exception will need to be examined to learn why it was thrown.
GC logs do not indicate the cause of OOME. If that information is
logged at all, and it might not be, it will be in solr.log.
Based on the paths visible in your GC logs, the following command might
find the cause, if it was logged, and if the relevant log has not been
rotated out:
grep -r OutOfMemory /apps/solr/solr_data/logs/*
At the very least it might help you find out which log file to
investigate further.
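To show what that grep workflow looks like end to end, here is a self-contained sketch that uses a throwaway log directory instead of the real /apps/solr/solr_data/logs path; the -l flag narrows the output to just the file names that matched:

```shell
#!/bin/sh
# Demo of locating the log file that recorded an OOME, using a
# temporary directory with fake log files in place of the real logs.
LOGDIR=$(mktemp -d)
printf 'java.lang.OutOfMemoryError: Java heap space\n' > "$LOGDIR/solr.log"
printf 'INFO  QueryComponent process\n' > "$LOGDIR/solr_gc.log"

# -r searches recursively, -l prints only matching file names.
# The file it names is the one to open and read around the match
# for the full stack trace.
MATCHES=$(grep -rl OutOfMemory "$LOGDIR"/*)
echo "$MATCHES"
rm -rf "$LOGDIR"
```

Once you know which file matched, open it and look at the lines around the OutOfMemoryError for the resource named in the exception and the stack trace that triggered it.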
Thanks,
Shawn