On Tue, Jan 25, 2011 at 2:06 PM, Markus Jelsma
<markus.jel...@openindex.io> wrote:

> On Tuesday 25 January 2011 11:54:55 Martin Grotzke wrote:
> > Hi,
> >
> > recently we're experiencing OOMEs (GC overhead limit exceeded) in our
> > searches. Therefore I want to get some clarification on heap and cache
> > configuration.
> >
> > This is the situation:
> > - Solr 1.4.1 running on tomcat 6, Sun JVM 1.6.0_13 64bit
> > - JVM Heap Params: -Xmx8G -XX:MaxPermSize=256m -XX:NewSize=2G
> > -XX:MaxNewSize=2G -XX:SurvivorRatio=6 -XX:+UseParallelOldGC
> > -XX:+UseParallelGC
>
> Consider switching to the HotSpot JVM and use -server as the first switch.

The JVM options I mentioned were not the complete list; we are of course
running the JVM with -server.
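
For completeness, the full command line looks roughly like this; the GC
logging flags at the end are ones we are considering adding to diagnose the
OOMEs, not flags we currently run, and the gc.log path is just an example:

    JAVA_OPTS="-server -Xmx8G -XX:MaxPermSize=256m \
      -XX:NewSize=2G -XX:MaxNewSize=2G -XX:SurvivorRatio=6 \
      -XX:+UseParallelGC -XX:+UseParallelOldGC \
      -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps \
      -Xloggc:/path/to/gc.log"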


>
> > - The machine has 32 GB RAM
> > - Currently there are 4 processors/cores in the machine, this shall be
> > changed to 2 cores in the future.
> > - The index size in the filesystem is ~9.5 GB
> > - The index contains ~ 5.500.000 documents
> > - 1.500.000 of those docs are available for searches/queries; the rest
> > are inactive docs that are excluded from searches (via a flag/field), but
> > they're still stored in the index as they need to be available by id
> > (Solr is the main document store in this app)
>
> How do you exclude them? That should be done with filter queries.

The docs are indexed with a field "findable" on which we do a filter query.
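
In other words, requests look roughly like this (the q value is just a
placeholder; "findable" is a boolean field in our schema):

    /select?q=<user query>&fq=findable:true

so the exclusion goes into an fq rather than being baked into the main
query, which also means it gets cached in the filterCache and reused across
queries.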


> I also seem to remember (but I just cannot find the reference, so please
> correct me if I'm wrong) that in 1.4.x sorting is done before filtering. It
> should be an improvement if filtering is done before sorting.
>
Hmm, I cannot imagine a case where it makes sense to sort before filtering,
and I can't believe that Solr actually does it like this.
Can anyone shed some light on this?


> If you use sorting, it takes up a huge amount of RAM if filtering is not
> done first.
>
> > - Caches are configured with a big size (the idea was to prevent
> > filesystem access / disk I/O as much as possible):
>
> There is only disk I/O if the kernel can't keep the index (or parts) in its
> page cache.
>
Yes, I'll keep an eye on disk I/O.
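
Something along these lines should be enough for that (standard tools,
nothing Solr-specific):

    iostat -x 5   # per-device utilisation and wait times, every 5 seconds
    free -m       # how much RAM is left to the kernel for the page cache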



> >   - filterCache (solr.LRUCache): size=200000, initialSize=30000,
> > autowarmCount=1000, actual size =~ 60.000, hitratio =~ 0.99
> >   - documentCache (solr.LRUCache): size=200000, initialSize=100000,
> > autowarmCount=0, actual size =~ 160.000 - 190.000, hitratio =~ 0.74
> >   - queryResultCache (solr.LRUCache): size=200000, initialSize=30000,
> > autowarmCount=10000, actual size =~ 10.000 - 60.000, hitratio =~ 0.71
>
> You should decrease the initialSize values, but your hit ratios look very
> good.
>
Does initialSize have a real impact on memory? According to
http://wiki.apache.org/solr/SolrCaching#initialSize it's only the initial
capacity of the HashMap backing the cache, so I'd expect it to affect the
initial allocation rather than what the cache ends up holding.
What would you say are reasonable values for size/initialSize/autowarmCount?
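
For reference, the corresponding declarations in our solrconfig.xml look
roughly like this (values as quoted above; a sketch rather than a verbatim
copy of our config):

    <filterCache      class="solr.LRUCache" size="200000"
                      initialSize="30000"  autowarmCount="1000"/>
    <documentCache    class="solr.LRUCache" size="200000"
                      initialSize="100000" autowarmCount="0"/>
    <queryResultCache class="solr.LRUCache" size="200000"
                      initialSize="30000"  autowarmCount="10000"/>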

Cheers,
Martin
