The FieldCache gets populated the first time a given field is referenced as a facet, and that entry then stays in memory for as long as the underlying index reader remains open. So, as additional queries get executed with different facet fields, the number of FieldCache entries will grow.

If I understand what you have said, these faceted queries do work initially, but after a while they start failing with OOM (out of memory) errors, correct?

The size of a single FieldCache depends on the field type. Since you are using dynamic fields, it depends on your "dynamicField" types - which you have not told us about. From your query I see that your fields start with "S_" and "F_" - presumably you have dynamic field types "S_*" and "F_*"? Are they strings, integers, floats, or what?
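To see which dynamic field prefixes a logged query actually facets on, a small script can tally the facet.field parameters. This is only a sketch: the helper name facet_prefix_counts is made up for illustration, and the sample string is a shortened excerpt of the query quoted below, not the full request.

```python
from collections import Counter
from urllib.parse import parse_qs


def facet_prefix_counts(query_string):
    """Count facet.field parameters by the prefix before the first underscore."""
    params = parse_qs(query_string)
    fields = params.get("facet.field", [])
    # "S_P1406389942" -> "S_", "F_P1946367021" -> "F_", and so on.
    return Counter(name.split("_", 1)[0] + "_" for name in fields)


# Shortened excerpt of the logged query; field names are taken from it.
sample = (
    "q=*%3A*&rows=0&facet=true"
    "&facet.field=S_C1503120369&facet.field=S_P1406389942"
    "&facet.field=F_P1946367021&facet.field=I_P1406452433"
)
counts = facet_prefix_counts(sample)
for prefix, count in sorted(counts.items()):
    print(prefix, count)
```

Run against the full logged query, this would also report the "I_" fields that appear there alongside "S_" and "F_".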

Each FieldCache entry is an array with maxDoc elements (your total number of documents, 1.4 million), so its size is roughly maxDoc times the size of one field value or, for string fields, the size of a string reference in your JVM.

String fields will take more space than numeric fields for the FieldCache, since a separate table is maintained for the unique terms in that field. Roughly what is the typical or average length of one of your facet field values? And, on average, how many unique terms are there within a typical faceted field?
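A rough back-of-the-envelope model of the per-field cost, to plug your own numbers into. Everything here is an assumption made for illustration rather than a measurement of Lucene internals: 4 bytes per numeric value or per-document reference, 2 bytes per character in a Java string, and about 40 bytes of fixed overhead per string object. The function names are hypothetical.

```python
def numeric_entry_bytes(max_doc, value_bytes=4):
    """One array slot per document (assumed 4 bytes per value)."""
    return max_doc * value_bytes


def string_entry_bytes(max_doc, unique_terms, avg_term_chars,
                       ref_bytes=4, string_overhead_bytes=40):
    """Per-document array plus a separate table of the unique terms."""
    per_doc_array = max_doc * ref_bytes
    # Assumed cost per unique term: fixed object overhead + 2 bytes per char.
    term_table = unique_terms * (string_overhead_bytes + 2 * avg_term_chars)
    return per_doc_array + term_table


MAX_DOC = 1_400_000  # document count from the original post

# The unique-term count and term length below are placeholders, not known values.
print(numeric_entry_bytes(MAX_DOC))
print(string_entry_bytes(MAX_DOC, unique_terms=10_000, avg_term_chars=20))
```

Under this model the extra cost of a string field is driven by the number of unique terms and their length, which is why those two numbers matter.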

If you can convert many of these faceted fields to simple integers the size should go down dramatically, but that depends on your application.

3 GB sounds like it might not be enough for such heavy use of faceting. It is probably not the 50-70 facet fields in any one query, but the 440 faceted fields in total, accumulated in the cache across many queries, that push the memory usage up.
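To put numbers on that, here is a lower-bound estimate that counts only one per-document array per cached field and ignores the unique-term tables for string fields. The 4 bytes per document is an assumption; the document count, field counts, and heap size come from the original post.

```python
MAX_DOC = 1_400_000          # documents in the index (from the post)
BYTES_PER_DOC = 4            # assumed size of one array slot
HEAP_BYTES = 3 * 1024 ** 3   # 3 GB allocated to the app server


def cache_bytes(num_fields):
    """Lower bound: one maxDoc-sized array per cached facet field."""
    return num_fields * MAX_DOC * BYTES_PER_DOC


for num_fields in (50, 70, 440):
    size = cache_bytes(num_fields)
    share = size / HEAP_BYTES
    print(f"{num_fields:>3} fields: {size / 1e6:,.0f} MB ({share:.0%} of heap)")
```

With these assumptions, one query's worth of fields fits comfortably, while all 440 fields together take most of the heap before any term tables are counted.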

When you hit OOM, what does the Solr admin stats display say for FieldCache?

-- Jack Krupansky

-----Original Message-----
From: Rahul R
Sent: Wednesday, May 02, 2012 2:22 AM
To: solr-user@lucene.apache.org
Subject: Re: Lucene FieldCache - Out of memory exception

Here is one sample query that I picked up from the log file :

q=*%3A*&fq=Category%3A%223__107%22&fq=S_P1540477699%3A%22MICROCIRCUIT%2C+LINE+TRANSCEIVERS%22&rows=0&facet=true&facet.mincount=1&facet.limit=2&facet.field=S_C1503120369&facet.field=S_P1406389942&facet.field=S_P1430116878&facet.field=S_P1430116881&facet.field=S_P1406453552&facet.field=S_P1406451296&facet.field=S_P1406452465&facet.field=S_C2968809156&facet.field=S_P1406389980&facet.field=S_P1540477699&facet.field=S_P1406389982&facet.field=S_P1406389984&facet.field=S_P1406451284&facet.field=S_P1406389926&facet.field=S_P1424886581&facet.field=S_P2017662632&facet.field=F_P1946367021&facet.field=S_P1430116884&facet.field=S_P2017662620&facet.field=F_P1406451304&facet.field=F_P1406451306&facet.field=F_P1406451308&facet.field=S_P1500901421&facet.field=S_P1507138990&facet.field=I_P1406452433&facet.field=I_P1406453565&facet.field=I_P1406452463&facet.field=I_P1406453573&facet.field=I_P1406451324&facet.field=I_P1406451288&facet.field=S_P1406451282&facet.field=S_P1406452471&facet.field=S_P1424886605&facet.field=S_P1946367015&facet.field=S_P1424886598&facet.field=S_P1946367018&facet.field=S_P1406453556&facet.field=S_P1406389932&facet.field=S_P2017662623&facet.field=S_P1406450978&facet.field=F_P1406452455&facet.field=S_P1406389972&facet.field=S_P1406389974&facet.field=S_P1406389986&facet.field=F_P1946367027&facet.field=F_P1406451294&facet.field=F_P1406451286&facet.field=F_P1406451328&facet.field=S_P1424886593&facet.field=S_P1406453567&facet.field=S_P2017662629&facet.field=S_P1406453571&facet.field=F_P1946367030&facet.field=S_P1406453569&facet.field=S_P2017662626&facet.field=S_P1406389978&facet.field=F_P1946367024

My primary question here is: can Solr handle this kind of query with so
many facet fields? I have tried using both enum and fc for facet.method and
there is no improvement with either.

Appreciate any help on this. Thank you.

- Rahul


On Mon, Apr 30, 2012 at 2:53 PM, Rahul R <rahul.s...@gmail.com> wrote:

Hello,
I am using solr 1.3 with jdk 1.5.0_14 and weblogic 10MP1 application
server on Solaris. I use embedded solr server. More details :
Number of docs in solr index : 1.4 million
Physical size of index : 640MB
Total number of fields in the index : 700 (99% of these are dynamic fields)
Total number of fields enabled for faceting : 440
Avg number of facet fields participating in a faceted query : 50-70
Total RAM allocated to weblogic appserver : 3GB (max possible)

In a multi user environment with 3 users using this application for a
period of around 40 minutes, the application runs out of memory. Analysis
of the heap dump shows that almost 85% of the memory is retained by the
FieldCache. Now I understand that the field cache is out of our control but
would appreciate some suggestions on how to handle this issue.

Some questions on this front :
- some mail threads on this forum seem to indicate that there could be
some connection between having dynamic fields and usage of FieldCache. Is
this true ? Most of the fields in my index are dynamic fields.
- as mentioned above, most of my faceted queries could have around 50-70
facet fields (I would do SolrQuery.addFacetField() for around 50-70 fields
per query). Could this be the source of the problem ? Is this too high for
solr to support ?
- Initially, I had a facet.sort defined in solrconfig.xml. Since
FieldCache builds up on sorting, I even removed the facet.sort and tried,
but no respite. The behavior is same as before.
- The document id that I have for each document is quite big (around 50
characters on average). Can this be a problem ? I reduced this to around 15
characters and tried but still there is no improvement.
- Can the size of the data be a problem ? But on this forum, I see many
users talking of more than 100 million documents in their index. I have
only 1.4 million with physical size of 640MB. The physical server on which
this application is running, has sufficient RAM and CPU.
- What gets stored in the FieldCache ? Is it the entire document or just
the document Id ?


Any help is much appreciated. Thank you.

regards
Rahul



