Re: Solr and Garbage Collection

Mark Miller Sat, 26 Sep 2009 09:55:20 -0700

Jonathan Ariel wrote:
> I have around 8M documents.
>   
That's actually not so bad - I take it you are faceting/sorting on quite
a few unique fields?


> I set up my server to use a different collector and it seems like it
> decreased from 11% to 4%, of course I need to wait a bit more because it is
> just a 1 hour old log. But it seems like it is much better now.
> I will tell you on Monday the results :)
>   
Are you still seeing major collections then (e.g., the tenured space hits
its limit)? You might be able to do even better.
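
One way to check which collectors are active, and how often each one runs,
is the standard java.lang.management API. What follows is only a minimal
sketch, not something posted in this thread: the class name GcReport and
the helper overheadPercent are hypothetical names chosen for illustration.
It reports on the JVM it runs in, so it would have to be executed inside
the Solr JVM (or the same MXBeans read remotely over JMX, which is what
JConsole does). The reported collection time is accumulated elapsed time,
which is not necessarily pure pause time for a concurrent collector.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;

public class GcReport {

    /** Percentage of uptime spent in GC; returns 0 when uptime is not positive. */
    static double overheadPercent(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 0.0;
        }
        return 100.0 * gcMillis / uptimeMillis;
    }

    /** Builds one line per collector plus a total overhead line for this JVM. */
    static String report() {
        List<GarbageCollectorMXBean> beans =
                ManagementFactory.getGarbageCollectorMXBeans();
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();

        StringBuilder sb = new StringBuilder();
        long totalGcMillis = 0;
        for (GarbageCollectorMXBean gc : beans) {
            // Both getters return -1 when the value is undefined for a collector.
            long count = Math.max(0, gc.getCollectionCount());
            long millis = Math.max(0, gc.getCollectionTime());
            totalGcMillis += millis;
            sb.append(String.format("%s: %d collections, %d ms%n",
                    gc.getName(), count, millis));
        }
        sb.append(String.format("GC overhead: %.2f%% of %d ms uptime%n",
                overheadPercent(totalGcMillis, uptime), uptime));
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(report());
    }
}
```

The collector names in the output identify which GC the JVM picked, and a
growing collection count on the old-generation collector indicates that
major collections are still happening.
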
> On Fri, Sep 25, 2009 at 6:07 PM, Mark Miller <markrmil...@gmail.com> wrote:
>
>   
>> That's a good point too - if you can reduce your need for such a large
>> heap, by all means, do so.
>>
>> However, considering you already need at least 10GB or you get OOM, you
>> have a long way to go with that approach. Good luck :)
>>
>> How many docs do you have? I'm guessing it's mostly FieldCache type
>> stuff, and that's the type of thing you can't really sidestep, unless
>> you give up the functionality that's using it.
>>
>> Grant Ingersoll wrote:
>>     
>>> On Sep 25, 2009, at 9:30 AM, Jonathan Ariel wrote:
>>>
>>>       
>>>> Hi to all!
>>>> Lately my Solr servers seem to stop responding once in a while. I'm
>>>> using Solr 1.3.
>>>> Of course I'm having more traffic on the servers.
>>>> So I logged the Garbage Collection activity to check if it's because of
>>>> that. It seems like 11% of the time the application runs, it is stopped
>>>> because of GC. And sometimes the GC takes up to 10 seconds!
>>>> Is it normal? My instances run on 16GB RAM, Dual Quad Core Intel Xeon
>>>> servers. My index is around 10GB and I'm giving the instances 10GB of
>>>> RAM.
>>>>
>>>> How can I check which GC is being used? If I'm right, JVM
>>>> Ergonomics should use the Throughput GC, but I'm not 100% sure. Do
>>>> you have any recommendation on this?
>>>>         
>>> As I said in Eteve's thread on JVM settings, some extra time spent on
>>> application design/debugging will save a whole lot of headache in
>>> Garbage Collection and trying to tune the gazillion different options
>>> available.  Ask yourself:  What is on the heap and does it need to be
>>> there?  For instance, do you, if you have them, really need sortable
>>> ints?   If your servers seem to come to a stop, I'm going to bet you
>>> have major collections going on.  Major collections in a production
>>> system are very bad.  They tend to happen right after commits in
>>> poorly tuned systems, but can also happen in other places if you let
>>> things build up due to really large heaps and/or things like really
>>> large cache settings.  I would pull up JConsole and have a look at
>>> what is happening when the pauses occur.  Is it a major collection?
>>> If so, then hook up a heap analyzer or a profiler and see what is on
>>> the heap around those times.  Then have a look at your schema/config,
>>> etc. and see if there are things that are memory intensive (sorting,
>>> faceting, excessively large filter caches).
>>>
>>> --------------------------
>>> Grant Ingersoll
>>> http://www.lucidimagination.com/
>>>
>>> Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)
>>> using Solr/Lucene:
>>> http://www.lucidimagination.com/search
>>>
>>>       
>> --
>> - Mark
>>
>> http://www.lucidimagination.com
>>
>>
>>
>>
>>     
>
>   


-- 
- Mark

http://www.lucidimagination.com
