Hi Erik,
As far as I can see with MemoryAnalyzer in the heap dump:
- the class fieldCache has a HashMap
- one entry of the HashMap is FieldCacheImpl$StringIndex, which is "mister big"
- FieldCacheImpl$StringIndex is backed by a WeakHashMap
- the WeakHashMap has three entries
-- 63.58 percent of heap
-- 8.14 percent of heap
-- 1.74 percent of heap
As far as I know, entries in a WeakHashMap should be garbage collectable, shouldn't they?
When constructing a HashMap or WeakHashMap, only two parameters are possible:
the initial capacity and the load factor. I see in my heap dump:
float DEFAULT_LOAD_FACTOR      0.75
int   DEFAULT_INITIAL_CAPACITY 16
But when looking into the statics I also have:
int   MAXIMUM_CAPACITY         1,073,741,824
If I understand this right, then a HashMap/WeakHashMap can have over 1 billion
buckets.
That's huge.
And it can't be reduced by a constructor parameter :-(
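Just to illustrate the two tunable parameters (a minimal sketch of plain
java.util usage, nothing Solr-specific; the values are only the defaults
quoted above):

  import java.util.WeakHashMap;

  // Only initial capacity and load factor can be passed in;
  // MAXIMUM_CAPACITY (1 << 30) is a fixed constant inside the map.
  WeakHashMap<Object, Object> map = new WeakHashMap<Object, Object>(16, 0.75f);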
Another thing to mention about fieldCache:
insanity_count : 1
insanity#0 : SUBREADER: Found caches for decendents of ReadOnlyDirectoryReader(segments_ov _s1u(3.2):C28940964)+f_dcyear
'ReadOnlyDirectoryReader(segments_ov _s1u(3.2):C28940964)'=>'f_dcyear',
class org.apache.lucene.search.FieldCache$StringIndex,null=>org.apache.lucene.search.FieldCache$StringIndex#1574857404
'org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput@f17ea34'=>'f_dcyear',
class org.apache.lucene.search.FieldCache$StringIndex,null=>org.apache.lucene.search.FieldCache$StringIndex#179165101
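For what it's worth, this report looks like it comes from Lucene's
FieldCacheSanityChecker. If I read the 3.x API correctly, the same output can
be produced by hand with something like the following sketch (not verified
against 3.2):

  import org.apache.lucene.search.FieldCache;
  import org.apache.lucene.util.FieldCacheSanityChecker;
  import org.apache.lucene.util.FieldCacheSanityChecker.Insanity;

  // ask the sanity checker for duplicate/overlapping FieldCache entries
  Insanity[] insanity = FieldCacheSanityChecker.checkSanity(FieldCache.DEFAULT);
  for (Insanity i : insanity) {
      System.out.println(i);  // prints SUBREADER/VALUEMISMATCH reports like the one above
  }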
What does this tell me?
Do I have problems with field f_dcyear (and if so, why)?
Regards
Bernd
On 17.06.2011 14:13, Erick Erickson wrote:
Sorry, it was late last night when I typed that...
Basically, if you sort and facet on #all# the fields you mentioned, it should
populate the cache in one go. If the problem is that you just have too many
unique terms for all those operations, then it should go bOOM.
But, frankly, that's unlikely; I'm just suggesting that to be sure the easy
case isn't the problem. Take a memory snapshot at that point just to see, it
should be a high-water mark.
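If it helps, on a Sun/Oracle JVM you can grab that snapshot with jmap, or
trigger one programmatically via the HotSpotDiagnostic MBean. A rough sketch
(the dump path is just an example):

  import java.lang.management.ManagementFactory;
  import com.sun.management.HotSpotDiagnosticMXBean;

  HotSpotDiagnosticMXBean diag = ManagementFactory.newPlatformMXBeanProxy(
      ManagementFactory.getPlatformMBeanServer(),
      "com.sun.management:type=HotSpotDiagnostic",
      HotSpotDiagnosticMXBean.class);
  diag.dumpHeap("/tmp/solr-highwater.hprof", true);  // live=true: reachable objects only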
The fact that you increase the heap and can then run for longer is extremely
suspicious, and really smells like a memory issue, so we'd like to pursue it.
I'd be really interested if anyone else is seeing anything similar, these are
the scary ones...
Best
Erick
On Fri, Jun 17, 2011 at 3:09 AM, Bernd Fehling
<bernd.fehl...@uni-bielefeld.de> wrote:
Hi Erik,
I will take some memory snapshots during the next week,
but how can a single query lead to OOMs?
- I started with 6g for the JVM --> 1 day until OOM
- increased to 8g  --> 2 days until OOM
- increased to 10g --> 3.5 days until OOM
- increased to 16g --> 5 days until OOM
- currently 20g    --> about 7 days until OOM
Starting the system takes about 3.5g and goes up to about 4g after a while.
The only dirty workaround so far is to restart the whole system after 5
days.
Not really nice.
The problem seems to be the fieldCache, which is under the hood of Jetty.
Do you know of any sizing options for the fieldCache to limit its memory
consumption?
Regards
Bernd
On 17.06.2011 03:37, Erick Erickson wrote:
Well, if my theory is right, you should be able to generate OOMs at will by
sorting and faceting on all your fields in one query.
But Lucene's cache should be garbage collected, can you take some memory
snapshots during the week? It should hit a point and stay steady there.
How much memory are you giving your JVM? It looks like a lot given your
memory snapshot.
Best
Erick
On Thu, Jun 16, 2011 at 3:01 AM, Bernd Fehling
<bernd.fehl...@uni-bielefeld.de> wrote:
Hi Erik,
yes I'm sorting and faceting.
1) Fields for sorting:
sort=f_dccreator_sort, sort=f_dctitle, sort=f_dcyear
The parameter "facet.sort=" is empty; I'm only using the "sort=" parameter.
2) Fields for faceting:
f_dcperson, f_dcsubject, f_dcyear, f_dccollection, f_dclang, f_dctypenorm,
f_dccontenttype
Other faceting parameters:
...&facet=true&facet.mincount=1&facet.limit=100&facet.sort=&facet.prefix=&...
3) The LukeRequestHandler takes too long for my huge index, so this is from
the standalone Luke (compiled for Solr 3.2):
f_dccreator_sort = 10,029,196
f_dctitle        = 21,514,939
f_dcyear         =      1,471
f_dcperson       = 14,138,165
f_dcsubject      =  8,012,319
f_dccollection   =      1,863
f_dclang         =        299
f_dctypenorm     =         14
f_dccontenttype  =        497
numDocs:  28,940,964
numTerms: 686,813,235
optimized: true
hasDeletions: false
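My own rough guess at what a single StringIndex entry might cost, assuming the
3.x layout of one int ord per document plus a String[] of the unique terms
(only a back-of-envelope sketch, the 60 bytes per term is a pure assumption):

  long maxDoc = 28940964L;                 // numDocs from Luke
  long ordArrayBytes = maxDoc * 4L;        // int[] order: ~110 MB per sorted/faceted field
  long titleTerms = 21514939L;             // unique terms in f_dctitle
  long titleTermBytes = titleTerms * 60L;  // assumed ~60 bytes per String incl. object overhead
  System.out.println(ordArrayBytes + titleTermBytes);  // roughly 1.4 GB for f_dctitle alone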
Beyond that rough guess, what can you read/calculate from these values?
Is my index too big for Lucene/Solr?
What I don't understand is why the fieldCache is not garbage collected and
therefore reduced in size from time to time.
Regards
Bernd
On 15.06.2011 17:50, Erick Erickson wrote:
The first question I have is whether you're sorting and/or faceting on many
unique string values. I'm guessing that somewhere you are. So, some questions
to help pin it down:
1> what fields are you sorting on?
2> what fields are you faceting on?
3> how many unique terms in each (see the solr admin page).
Best
Erick
On Wed, Jun 15, 2011 at 8:22 AM, Bernd Fehling
<bernd.fehl...@uni-bielefeld.de> wrote:
Dear list,
after getting an OOM exception after one week of operation with Solr 3.2,
I used MemoryAnalyzer on the heap dump file.
It looks like the fieldCache eats up all the memory.
Class                                                       Objects    Shallow Heap    Retained Heap
org.apache.lucene.search.FieldCache                               0               0  = 14,636,950,632
org.apache.lucene.search.FieldCacheImpl                           1              32  = 14,636,950,384
org.apache.lucene.search.FieldCacheImpl$StringIndexCache          1              32  = 14,636,947,080
org.apache.lucene.search.FieldCache$StringIndex                  10             320  = 14,636,944,352
java.lang.String[]                                               519     567,811,040  = 13,503,733,312
char[]                                                    81,766,595  11,604,293,712  = 11,604,293,712
fieldCache retains over 14g of heap.
When looking at the stats page, under fieldCache the description says:
"Provides introspection of the Lucene FieldCache, this is **NOT** a cache
that is managed by Solr."
So is this a Jetty problem and not Solr?
Why is fieldCache growing and growing until OOM?
Regards
Bernd