On 6/25/2020 2:08 PM, Odysci wrote:
> I have a SolrCloud setup with a 12GB heap and I've been trying to optimize it
> to avoid OOM errors. My index has about 30 million docs and about 80GB
> total, 2 shards, 2 replicas.

Have you seen the full OutOfMemoryError exception text? OOME can be caused by problems that are not actually memory-related. Unless the error specifically mentions "heap space" we might be chasing the wrong thing here.
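For example, the two most common variants of the message look like this (these are standard HotSpot error strings, not lines taken from your logs) -- only the first one indicates an actual heap-space shortage; the second means the OS refused to create a thread:

```
java.lang.OutOfMemoryError: Java heap space
java.lang.OutOfMemoryError: unable to create new native thread
```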

> When the queries return a smallish number of docs (say, below 1000), the
> heap behavior seems "normal". Monitoring the GC log, I see that the young
> generation grows, then when GC kicks in it goes considerably down, and the
> old generation grows just a bit.

> However, at some point I have a query that returns over 300K docs (for a
> total size of approximately 1GB). At that point the old generation
> grows by almost 2GB, and it remains high for all the remaining time.
> Even as new queries are executed, the old generation size does not go down,
> despite multiple GC runs afterwards.

Assuming the OOME exceptions were indeed caused by running out of heap, then the following paragraphs will apply:

G1 has a concept called "humongous allocations". To earn that designation, an allocation must be at least half the G1 heap region size. You have set the region size to 4 megabytes, so any allocation of 2 megabytes or larger is humongous. Humongous allocations bypass the new generation entirely and go directly into the old generation. The maximum value that can be set for the G1 region size is 32MB. If you increase the region size and the behavior changes, then humongous allocations would be something to investigate.
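The threshold arithmetic is simple enough to sketch (the region size is set with the real HotSpot flag -XX:G1HeapRegionSize; the class and method names below are illustrative, not Solr or JDK APIs):

```java
public class HumongousCheck {
    // G1 treats an allocation as "humongous" when it is at least half
    // the region size. With -XX:G1HeapRegionSize=4m the threshold is 2MB;
    // raising the region size to the 32MB maximum raises it to 16MB.
    static boolean isHumongous(long allocationBytes, long regionSizeBytes) {
        return allocationBytes >= regionSizeBytes / 2;
    }

    public static void main(String[] args) {
        long fourMb = 4L * 1024 * 1024;
        long thirtyTwoMb = 32L * 1024 * 1024;
        // A 2MB object is humongous with 4MB regions...
        System.out.println(isHumongous(2L * 1024 * 1024, fourMb));
        // ...but the same object is not humongous with 32MB regions.
        System.out.println(isHumongous(2L * 1024 * 1024, thirtyTwoMb));
    }
}
```

A 300K-doc response buffered as a single ~1GB structure would be built from arrays far above either threshold, which is consistent with the old generation jumping at exactly that query.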

In the versions of Java that I have used, humongous allocations can only be reclaimed as garbage by a full GC. I do not know whether Oracle has changed this so that the smaller collections can reclaim them.

Were any of those multiple GCs a Full GC? If they were, then there is probably little or no garbage to collect. You've gotten a reply from "Zisis T." with some possible causes for this. I do not have anything to add.

I did not know about any problems with maxRamMB, but if I were attempting to limit cache sizes, I would do so with the size values, not a specific RAM limit. The size values you have chosen (8192 and 16384) will most likely allow a total cache size well beyond the limits you've indicated with maxRamMB. So if there are any bugs in the code handling the maxRamMB parameter, you might end up using a LOT more memory than you expected.
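For reference, a cache limited by entry count alone might look like this in solrconfig.xml (the class name and values here are illustrative for a recent Solr 8.x; older versions use FastLRUCache or LRUCache instead):

```
<filterCache class="solr.CaffeineCache"
             size="512"
             initialSize="512"
             autowarmCount="0"/>
```

The count matters because filterCache entries can be large: each one can be a bitset of roughly maxDoc/8 bytes. With about 15 million docs per shard, that is nearly 2MB per entry, so a size of 8192 could in principle hold on the order of 15GB of bitsets -- far beyond any maxRamMB figure you would plausibly have set.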

Thanks,
Shawn
