: Subject: Date faceting and memory leaks

First off, just to be clear: you don't seem to be using the "date 
faceting" feature. You are using the "Facet Query" feature; your queries 
just so happen to be on a date field.

Second: to help people help you, you need to provide all the details.  
You've shown us the "appends" section of your request handler config, but 
you haven't given us any other details about the queries -- what does the 
*full* configuration look like for this handler?  What do all the test 
URLs look like? etc...  You also haven't given us any other details 
about your Solr setup.  In particular, knowing what your cache 
configurations look like is crucial.

: I have been running load testing using JMeter on a Solr 1.4 index with ~4
: million docs. I notice a steady JVM heap size increase as I iterator 100
: query terms a number of times against the index. The GC does not seems to
: claim the heap after the test run is completed. It will run into OutOfMemory

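Third: how *exactly* are you measuring/monitoring "heap size"? ... You 
won't necessarily see the heap decrease in size, even after GC, since the 
JVM usually holds on to memory it has already allocated.  What matters is 
how much of the heap is still in use after a full GC.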

Fourth: what do your cache sizes (and cache hit rates) look like 
before/during/after your test run?  I ask about this specifically because 
the queries you have configured don't do any date rounding, which means 
Solr will attempt to cache a different range query for each of your hard 
coded facet.query ranges every millisecond that it receives a request...

:     <str name="facet.query">{!ex=last_modified}last_modified:[NOW-30DAY TO
: *]</str>

...so you might want to consider changing those to things like...

   <str name="facet.query">{!ex=last_modified}last_modified:[NOW/DAY-90DAY TO NOW/DAY-30DAY]</str>

...if what you care about is "day" precision.  Presumably in your requests 
you have an "fq" that is "tagged" with the name "last_modified"? (See what 
I mean about needing all the details? I'm just guessing here based on what 
I know.) ... You'll want that to "round" down to the start of the day as 
well.
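
For illustration, a tagged and rounded "fq" might look something like 
the following.  This is only a sketch: I'm assuming the tag name 
"last_modified" (based on your {!ex=last_modified} usage) and showing it 
in config form, even though in practice it's probably sent as a request 
param.

```xml
<!-- assumed: a filter tagged "last_modified", rounded down to the start of the day -->
<str name="fq">{!tag=last_modified}last_modified:[NOW/DAY-30DAY TO *]</str>
```

Because NOW/DAY only changes once a day, every request during the same 
day produces the same query, and so the same filterCache key.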

These unique queries for every millisecond could easily explain getting an 
OOM if your filterCache is very large (since I don't know how big your 
filterCache is, or what kind of cache hit rates you are getting, I can 
only guess).
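
As a rough sense of scale (my assumption, since I haven't seen your 
config): a cached filter that matches a lot of docs is held as a bitset 
of about maxDoc/8 bytes, so with ~4 million docs that's roughly 0.5MB per 
entry.  The cache is declared in solrconfig.xml along these lines; the 
values here are purely illustrative.

```xml
<!-- illustrative values only, not a recommendation -->
<filterCache class="solr.FastLRUCache"
             size="512"
             initialSize="512"
             autowarmCount="0"/>
```

With those numbers, 512 large entries could hold on to something like 
250MB of heap, and a much larger "size" scales that up accordingly.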

: I have played for filterCache setting but does not have any effects as the
: date field cache seems be  managed by Lucene FieldCahce.

No.  A fieldCache is created for each field as needed (mainly for sorting, 
and in some cases for field term faceting) but for "facet.query"s like 
these (and for the corresponding "fq"s) an entry in the filterCache is 
created for each unique query.


-Hoss
