A step I meant to include: after you "warm" Solr with a representative
collection of queries that exercises all of the fields, facets, sorting,
etc. that your daily load will reference, check the Java heap usage at that
point, then set your Java heap limit a moderate amount higher, say 256M,
restart, and see what happens.
The theory is that if you have too much available heap, Java will gradually
fill it all with garbage (no leaks implied, though maybe some leaks as
well), and then a full GC becomes an expensive hit. A rapid flow of
incoming requests at that moment can sometimes cause Java to freak out and
even hit OOM, even though a more graceful garbage collection would
eventually have freed up tons of garbage.
So, by allowing only a moderate amount of garbage to accumulate, the more
frequent GCs will be less intensive and less likely to cause weird
situations.
The other part of the theory is that it is usually better to leave plenty
of memory to the OS for efficiently caching index files, rather than
forcing Java to manage large amounts of memory, which it typically does not
do well.
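If you want to script the "check the heap" step rather than eyeballing it
in jconsole, here is a minimal sketch that reads the heap numbers over JMX.
The port is an assumption - it presumes you started the Solr JVM with
remote JMX enabled (e.g., -Dcom.sun.management.jmxremote.port=9010, with
auth/SSL off for a local experiment):

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;
import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class HeapCheck {
    public static void main(String[] args) throws Exception {
        // Port 9010 is an assumption; match whatever you passed to the Solr JVM.
        JMXServiceURL url = new JMXServiceURL(
            "service:jmx:rmi:///jndi/rmi://localhost:9010/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection conn = connector.getMBeanServerConnection();
            MemoryMXBean memory = ManagementFactory.newPlatformMXBeanProxy(
                conn, ManagementFactory.MEMORY_MXBEAN_NAME, MemoryMXBean.class);
            MemoryUsage heap = memory.getHeapMemoryUsage();
            // "used" after warming is the number to size -Xmx against.
            System.out.printf("heap used=%dMB committed=%dMB max=%dMB%n",
                heap.getUsed() >> 20, heap.getCommitted() >> 20, heap.getMax() >> 20);
        } finally {
            connector.close();
        }
    }
}

jstat -gc <pid> or the admin UI will show roughly the same numbers; the
point is just to record "used" after warming and size -Xmx modestly above
it.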
-- Jack Krupansky
-----Original Message-----
From: Jack Krupansky
Sent: Sunday, March 24, 2013 2:00 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr using a ridiculous amount of memory
Just to get started, do you hit OOM quickly with a few expensive queries, or
is it after a number of hours and lots of queries?
Does Java heap usage seem to be growing linearly as queries come in, or are
there big spikes?
How complex/rich are your queries (e.g., how many terms, wildcards, faceted
fields, sorting, etc.)?
As a baseline experiment, start a Solr server and see how much Java heap is
used/available. Then do a couple of typical queries, and check the heap
size again. Then do a couple more that are similar but different (to avoid
query cache hits), and check the heap again. Do that a few times to get a
handle on the baseline memory required and on whether there might be a leak
of some sort. Do enough queries to hit all of the fields, facets, sorting,
etc. that are likely to be encountered on one of your typical days that
hits OOM - just not the full volume of queries. The goal is to determine
whether there is something inherently memory-intensive in your
index/queries, or something resembling a leak that grows with total query
volume.
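Here is a rough SolrJ sketch of that experiment - the core URL, query
terms, and facet/sort fields are placeholders, not taken from your setup,
so substitute whatever your real traffic uses:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;

public class BaselineProbe {
    public static void main(String[] args) throws Exception {
        // Placeholder core URL and field names - use your own.
        HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr/collection1");
        String[] terms = {"alpha", "beta", "gamma", "delta"};
        for (String term : terms) {
            // Vary the term each time so you don't just hit the query cache.
            SolrQuery q = new SolrQuery("title:" + term);
            q.setFacet(true);
            q.addFacetField("category", "brand");    // every facet field production uses
            q.addSort("price", SolrQuery.ORDER.asc); // every sort field production uses
            solr.query(q);
            // Check the heap (jconsole, jstat -gc <pid>, or the admin UI)
            // between iterations and watch whether it plateaus or keeps climbing.
        }
        solr.shutdown();
    }
}

If heap usage flattens out once your full vocabulary of fields, facets, and
sorts has been exercised, the problem is baseline size rather than a leak.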
-- Jack Krupansky
-----Original Message-----
From: John Nielsen
Sent: Sunday, March 24, 2013 4:19 AM
To: solr-user@lucene.apache.org
Subject: Solr using a ridiculous amount of memory
Hello all,
We are running a Solr cluster which is now on Solr 4.2.
The index is about 35GB on disk, with each register (document) between 15k
and 30k. (This is simply the size of the full XML reply for one register;
I'm not sure how to measure it otherwise.)
Our memory requirements are running amok. We have less than a quarter of
our customers running yet, and even though we have already allocated 25GB
to the JVM, we are still seeing daily OOM crashes. We used to just allocate
more memory to the JVM, but with the way Solr is scaling, we would need
well over 100GB of memory on each node to finish the project, and that's
just not going to happen. I need to lower the memory requirements somehow.
I can see from the memory dumps we've done that the field cache is by far
the biggest offender. Of special interest to me is the recent introduction
of DocValues, which supposedly mitigates this issue by keeping the data
outside the JVM heap. Because of the lack of documentation, though, I just
can't seem to make it work.
We do a lot of faceting. One client facets on about 50,000 docs of approx.
30k each, on 5 fields. I understand that this is VERY memory intensive.
Schema with our DocValues attempt at solving the problem:
http://pastebin.com/Ne23NnW4
Config: http://pastebin.com/x1qykyXW
The cache is pretty well tuned. Any lower and I get evictions.
Come hell or high water, my JVM memory requirements must come down. Simply
moving some of the memory load outside of the JVM would be awesome! Making
Solr not use the field cache for anything would also (probably) work for
me. I thought about killing off my other caches, but judging from the
dumps, they just don't seem to use that much memory.
I am at my wits end. Any help would be sorely appreciated.
--
Med venlig hilsen / Best regards
*John Nielsen*
Programmer
*MCB A/S*
Enghaven 15
DK-7500 Holstebro
Customer service: +45 9610 2824
p...@mcb.dk
www.mcb.dk