Bernd, in our case, optimizing the index seems to flush the FieldCache for
some reason. On the other hand, doing a few commits without optimizing seems
to make the problem worse.
Hope that helps. We would like to give it a try and debug this in Lucene,
but we are pressed for time right now. Perhaps later next week we will.
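
Once we get to it, something like this is what we plan to try (a quick sketch
against the Lucene 3.x FieldCache API, untested): dump the cache entries right
after a commit and again after an optimize, and diff the two listings.

import org.apache.lucene.search.FieldCache;
import org.apache.lucene.search.FieldCache.CacheEntry;

// Dumps whatever is currently sitting in the Lucene FieldCache.
public class FieldCacheDump {
  public static void dump() {
    CacheEntry[] entries = FieldCache.DEFAULT.getCacheEntries();
    System.out.println("FieldCache entries: " + entries.length);
    for (CacheEntry e : entries) {
      // field name, type of the cached values, and which reader owns the entry
      System.out.println(e.getFieldName() + " / " + e.getCacheType()
          + " / reader=" + e.getReaderKey());
    }
  }
}
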
Best,
Santiago
On Fri, Jul 22, 2011 at 4:01 AM, Bernd Fehling <bernd.fehl...@uni-bielefeld.de> wrote:
> The current status of my installation is that with some tweaking of
> the JVM settings I get a runtime of about 2 weeks until OldGen (14GB) is
> filled to 100 percent and won't free anything, even with a full GC.
> In a heap dump taken at that point, the fieldCache accounts for over 80 percent
> of the whole heap (20GB). And that is what eats up all of OldGen
> until OOM.
> Next week I will try Tomcat 6.x to see how that one behaves, but
> I don't have much hope. It is just a different container, which won't
> change anything about how Lucene eats up memory with the fieldCache.
>
> After digging through all the code, logging and debugging, I can say that it
> seems not to be a memory leak.
>
> Solr uses Lucene's fieldCache under the hood of the servlet
> container.
> The fieldCache grows until everything cacheable is in memory or OOM
> is reached, whichever comes first.
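>
> As far as I can tell, a big part of this is sorting: for every field you sort
> on, Lucene builds a StringIndex per reader and holds it as long as the reader
> is open. A rough sketch of what that costs per field (Lucene 3.x API; "field"
> is just a placeholder for one of our sort fields):
>
> import org.apache.lucene.index.IndexReader;
> import org.apache.lucene.search.FieldCache;
> import org.apache.lucene.search.FieldCache.StringIndex;
>
> public class SortCacheCost {
>   // order[] holds one int per document, lookup[] one String per unique term.
>   static void show(IndexReader reader, String field) throws Exception {
>     StringIndex si = FieldCache.DEFAULT.getStringIndex(reader, field);
>     System.out.println("order[]  : " + si.order.length + " ints (~"
>         + (si.order.length * 4L / (1024 * 1024)) + " MB)");
>     System.out.println("lookup[] : " + si.lookup.length + " entries");
>   }
> }
>
> With 29.5 million documents the order[] array alone is about 113 MB per sorted
> field per reader, before counting the term strings themselves.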
>
> The description says: "Provides introspection of the Lucene FieldCache,
> this is **NOT** a cache that is managed by Solr."
> So it seems to be a Lucene problem.
>
> As a matter of fact, due to this limitation Solr can't be used
> with a single huge index. I don't know how other applications that are
> using Lucene and its fieldCache (and there are a lot of them) are
> handling this and how they keep the size of the fieldCache in check.
> And I currently don't know how to calculate the limit.
> Say, for example: the combined size of the *.tii and *.tis files in the index
> should be the -Xmx size of your JVM to be safe from fieldCache
> OOMs.
>
> Maybe an expert can give more detailed info about the fieldCache and its
> possible maximum size.
>
> Some data about our index:
> -rw-r--r-- 1 solr users 84448291214 19. Jul 10:43 _12jl.fdt
> -rw-r--r-- 1 solr users 236458468 19. Jul 10:43 _12jl.fdx
> -rw-r--r-- 1 solr users 1208 19. Jul 10:30 _12jl.fnm
> -rw-r--r-- 1 solr users 19950615826 19. Jul 11:20 _12jl.frq
> -rw-r--r-- 1 solr users 532031548 19. Jul 11:20 _12jl.nrm
> -rw-r--r-- 1 solr users 20616887682 19. Jul 11:20 _12jl.prx
> -rw-r--r-- 1 solr users 291149087 19. Jul 11:20 _12jl.tii
> -rw-r--r-- 1 solr users 30850743727 19. Jul 11:20 _12jl.tis
> -rw-r--r-- 1 solr users 20 9. Jun 11:11 segments.gen
> -rw-r--r-- 1 solr users 274 19. Jul 11:20 segments_pl
> Size: 146.15 GB
> Docs: 29,557,308
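>
> If I apply that rule of thumb to the index above: _12jl.tii (291,149,087 bytes)
> plus _12jl.tis (30,850,743,727 bytes) comes to roughly 29 GB, which is already
> well above the 20 GB I can give the JVM. So by my own (admittedly rough)
> estimate this index can never fit its fieldCache.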
>
>
> Regards,
> Bernd
>
>
> On 22.07.2011 00:10, Santiago Bazerque wrote:
>
>> Hello Erick,
>>
>> I have a 1.7MM-document, 3.6GB index. I also have an unusually large number
>> of dynamic fields that I use for sorting. My FieldCache currently has about
>> 13,000 entries, even though the index only gets 1-3 queries per second. Each
>> query sorts by two dynamic fields and facets on 3-4 fields that are fixed.
>> These latter fields are always in the field cache; what I find suspicious is
>> the other ~13,000 entries that are sitting there.
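>>
>> One thing I still want to run (a sketch using Lucene's FieldCacheSanityChecker,
>> I haven't tried it on our index yet) is a check for entries that are duplicated
>> across readers, which might explain part of that number:
>>
>> import org.apache.lucene.search.FieldCache;
>> import org.apache.lucene.util.FieldCacheSanityChecker;
>> import org.apache.lucene.util.FieldCacheSanityChecker.Insanity;
>>
>> public class CacheSanity {
>>   // Reports suspicious cache usage, e.g. the same field cached for both a
>>   // top-level reader and its segment readers.
>>   public static void check() {
>>     Insanity[] problems = FieldCacheSanityChecker.checkSanity(FieldCache.DEFAULT);
>>     System.out.println(problems.length + " problems reported");
>>     for (Insanity insanity : problems) {
>>       System.out.println(insanity);
>>     }
>>   }
>> }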
>>
>> I am using a 32GB heap, and I am seeing periodic OOM errors (I didn't spot
>> a regular pattern as Bernd did, but I haven't increased RAM as methodically
>> as he has).
>>
>> If you need any more info, I'll be glad to post it to the list.
>>
>> Best,
>> Santiago
>>
>> On Fri, Jun 17, 2011 at 9:13 AM, Erick Erickson wrote:
>>
>>> Sorry, it was late last night when I typed that...
>>>
>>> Basically, if you sort and facet on #all# the fields you mentioned, it should
>>> populate the cache in one go. If the problem is that you just have too many
>>> unique terms for all those operations, then it should go bOOM.
>>>
>>> But, frankly, that's unlikely; I'm just suggesting it to be sure the easy case
>>> isn't the problem. Take a memory snapshot at that point just to see; it should
>>> be a high-water mark.
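>>>
>>> Something along these lines would exercise all of them in one request (SolrJ
>>> sketch; the field names are made up, substitute your real sort and facet
>>> fields):
>>>
>>> import org.apache.solr.client.solrj.SolrQuery;
>>> import org.apache.solr.client.solrj.SolrServer;
>>> import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
>>>
>>> public class WarmAllFields {
>>>   public static void main(String[] args) throws Exception {
>>>     SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
>>>     SolrQuery q = new SolrQuery("*:*");
>>>     // sort on every field you sort on in production (placeholder names)
>>>     q.addSortField("dyn_sort_field_1", SolrQuery.ORDER.asc);
>>>     q.addSortField("dyn_sort_field_2", SolrQuery.ORDER.asc);
>>>     // and facet on the fixed facet fields (placeholders as well)
>>>     q.setFacet(true);
>>>     q.addFacetField("facet_field_1", "facet_field_2", "facet_field_3");
>>>     server.query(q);
>>>   }
>>> }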
>>>
>>> The fact that you increase the heap and can then run for longer is extremely
>>> suspicious, and really smells like a memory issue, so we'd like to pursue it.
>>>
>>> I'd be really interested if anyone else is seeing anything similar; these are
>>> the scary ones...
>>>
>>> Best
>>> Erick
>>>
>>> On Fri, Jun 17, 2011 at 3:09 AM, Bernd Fehling wrote:
>>>
>>>> Hi Erick,
>>>> I will take some memory snapshots during the next week,
>>>> but how can it be that I get OOMs with one query?
>>>> - I started with 6g for the JVM --> 1 day until OOM.
>>>> - increased to 8g --> 2 days until OOM
>>>> - increased to 10g --> 3.5 days until OOM
>>>> - increased to 16g --> 5 days until OOM
>>>> - currently 20g --> about 7 days until OOM
>>>> Starting the system takes about 3.5g and goes up to about 4g after a while.
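>>>>
>>>> If I subtract that ~4g baseline, every configuration seems to fill the rest of
>>>> the heap at roughly the same rate, about 2 GB per day: (6-4)/1, (8-4)/2,
>>>> (10-4)/3.5, (16-4)/5 and (20-4)/7 all come out near 2. So the extra heap only
>>>> buys time; it doesn't change the growth.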