Just to get started, do you hit OOM quickly with a few expensive queries, or
is it after a number of hours and lots of queries?
Does Java heap usage seem to be growing linearly as queries come in, or are
there big spikes?
How complex/rich are your queries (e.g., how many terms, wildcards, faceted
fields, sorting, etc.)?
As a baseline experiment, start a Solr server and note how much Java heap is
used/available. Then run a couple of typical queries and check the heap size
again. Then run a couple more that are similar but different (to avoid query
cache hits), and check the heap again. Repeat that a few times to get a
handle on the baseline memory required and on whether there might be a leak
of some sort. Run enough queries to hit all of the fields, facets, sorting,
etc. that would be encountered on one of your typical days that ends in
OOM - just not the full query volume. The goal is to determine whether there
is something inherently memory-intensive in your index/queries, or something
resembling a leak driven by total query volume.
-- Jack Krupansky
-----Original Message-----
From: John Nielsen
Sent: Sunday, March 24, 2013 4:19 AM
To: solr-user@lucene.apache.org
Subject: Solr using a ridiculous amount of memory
Hello all,
We are running a solr cluster which is now running solr-4.2.
The index is about 35GB on disk, with each record between 15k and 30k.
(This is simply the size of the full XML reply for one record; I'm not sure
how else to measure it.)
Our memory requirements are running amok. We have less than a quarter of
our customers running now, and even though we have already allocated 25GB
to the JVM, we are still seeing daily OOM crashes. We used to just allocate
more memory to the JVM, but with the way Solr is scaling, we would need
well over 100GB of memory on each node to finish the project, and that's
just not going to happen. I need to lower the memory requirements somehow.
I can see from the memory dumps we've done that the field cache is by far
the biggest offender. Of special interest to me is the recent introduction
of DocValues, which supposedly mitigates this issue by using memory outside
the JVM. Because of the lack of documentation, I just can't seem to make it
work.
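For reference, a DocValues field declaration in a 4.2-era schema.xml looks roughly like the fragment below. The field and type names are placeholders, not taken from the linked schema; note in particular that simply setting docValues="true" keeps the values on the heap by default - moving them off-heap/on-disk requires (if I recall the 4.2 feature correctly) the "Disk" docValuesFormat on the fieldType.

```xml
<!-- Hypothetical example - names are placeholders.
     docValues requires a supporting type (e.g. StrField, Trie* numerics)
     and the field must be indexed and/or stored. -->
<fieldType name="string_dv" class="solr.StrField"
           sortMissingLast="true" docValuesFormat="Disk"/>

<field name="item_category" type="string_dv"
       indexed="true" stored="false" docValues="true"/>
```

If the facet fields in the pastebin schema lack either the docValues="true" attribute or a compatible fieldType, Solr will silently fall back to the FieldCache for faceting on them.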
We do a lot of faceting. One client facets on about 50,000 docs of approx.
30k each, across 5 fields. I understand that this is VERY memory intensive.
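For concreteness, a request for that workload would look something like the following (field names are invented placeholders, not from the actual schema); without DocValues, each distinct facet.field populates its own FieldCache entry sized to the whole index, which is where the heap goes.

```
/solr/collection1/select?q=*:*&rows=0
    &facet=true
    &facet.field=color
    &facet.field=size
    &facet.field=brand
    &facet.field=price_range
    &facet.field=availability
```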
Schema with DocValues attempt at solving problem:
http://pastebin.com/Ne23NnW4
Config: http://pastebin.com/x1qykyXW
The cache is pretty well tuned. Any lower and I get evictions.
Come hell or high water, my JVM memory requirements must come down. Simply
moving some memory load outside of the JVM would be awesome! Making it not
use the field cache for anything would also (probably) work for me. I
thought about killing off my other caches, but from the dumps, they just
don't seem to use that much memory.
I am at my wits end. Any help would be sorely appreciated.
--
Med venlig hilsen / Best regards
*John Nielsen*
Programmer
*MCB A/S*
Enghaven 15
DK-7500 Holstebro
Kundeservice: +45 9610 2824
p...@mcb.dk
www.mcb.dk