Jack,

Yes, the queries work fine till I hit the OOM. The fields that start with S_* are strings, F_* are floats, I_* are ints, and so on. The dynamic field definitions from schema.xml:

<dynamicField name="S_*" type="string" indexed="true" stored="true" omitNorms="true"/>
<dynamicField name="I_*" type="sint" indexed="true" stored="true" omitNorms="true"/>
<dynamicField name="F_*" type="sfloat" indexed="true" stored="true" omitNorms="true"/>
<dynamicField name="D_*" type="date" indexed="true" stored="true" omitNorms="true"/>
<dynamicField name="B_*" type="boolean" indexed="true" stored="true" omitNorms="true"/>

*Each FieldCache will be an array with maxdoc entries (your total number of documents - 1.4 million) times the size of the field value or whatever a string reference is in your JVM*

So if I understand correctly, every field (dynamic or normal) will have its own field cache, and the size of the field cache for any field will be (maxDocs * sizeOfField)? If the field has only 100 unique values, will it occupy (100 * sizeOfField), or will it still be (maxDocs * sizeOfField)?

*Roughly what is the typical or average length of one of your facet field values? And, on average, how many unique terms are there within a typical faceted field?*

Each field value may vary from 10 to 30 characters in length, with an average of maybe 20. The number of unique terms within a faceted field will vary from 100 to 1000, with an average of 300. How will the number of unique terms affect performance?

*3 GB sounds like it might not be enough for such heavy use of faceting. It is probably not the 50-70 number, but the 440 or accumulated number across many queries that pushes the memory usage up*

I am using jdk1.5.0_14 (32-bit). With a 32-bit JDK, I think there is a limit that prevents allocating more RAM.

*When you hit OOM, what does the Solr admin stats display say for FieldCache?*

I don't have Solr deployed as a separate web app. All the Solr jar files are in my webapp's WEB-INF\lib directory, and I use EmbeddedSolrServer. So is there a way I can get the information that the admin page would show?

Thank you for your time.

-Rahul

On Wed, May 2, 2012 at 5:19 PM, Jack Krupansky <j...@basetechnology.com> wrote:

> The FieldCache gets populated the first time a given field is referenced
> as a facet and then will stay around forever. So, as additional queries get
> executed with different facet fields, the number of FieldCache entries will
> grow.
>
> If I understand what you have said, these faceted queries do work
> initially, but after a while they stop working with OOM, correct?
>
> The size of a single FieldCache depends on the field type. Since you are
> using dynamic fields, it depends on your "dynamicField" types - which you
> have not told us about. From your query I see that your fields start with
> "S_" and "F_" - presumably you have dynamic field types "S_*" and "F_*"?
> Are they strings, integers, floats, or what?
>
> Each FieldCache will be an array with maxdoc entries (your total number of
> documents - 1.4 million) times the size of the field value or whatever a
> string reference is in your JVM.
>
> String fields will take more space than numeric fields for the FieldCache,
> since a separate table is maintained for the unique terms in that field.
> Roughly what is the typical or average length of one of your facet field
> values? And, on average, how many unique terms are there within a typical
> faceted field?
>
> If you can convert many of these faceted fields to simple integers the
> size should go down dramatically, but that depends on your application.
>
> 3 GB sounds like it might not be enough for such heavy use of faceting. It
> is probably not the 50-70 number, but the 440 or accumulated number across
> many queries that pushes the memory usage up.
>
> When you hit OOM, what does the Solr admin stats display say for
> FieldCache?
>
> -- Jack Krupansky
>
> -----Original Message----- From: Rahul R
> Sent: Wednesday, May 02, 2012 2:22 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Lucene FieldCache - Out of memory exception
>
>
> Here is one sample query that I picked up from the log file :
>
> q=*%3A*&fq=Category%3A%223__107%22&
> fq=S_P1540477699%3A%22MICROCIRCUIT%2C+LINE+TRANSCEIVERS%22&
> rows=0&facet=true&facet.mincount=1&facet.limit=2&
> facet.field=S_C1503120369&facet.field=S_P1406389942&
> facet.field=S_P1430116878&facet.field=S_P1430116881&
> facet.field=S_P1406453552&facet.field=S_P1406451296&
> facet.field=S_P1406452465&facet.field=S_C2968809156&
> facet.field=S_P1406389980&facet.field=S_P1540477699&
> facet.field=S_P1406389982&facet.field=S_P1406389984&
> facet.field=S_P1406451284&facet.field=S_P1406389926&
> facet.field=S_P1424886581&facet.field=S_P2017662632&
> facet.field=F_P1946367021&facet.field=S_P1430116884&
> facet.field=S_P2017662620&facet.field=F_P1406451304&
> facet.field=F_P1406451306&facet.field=F_P1406451308&
> facet.field=S_P1500901421&facet.field=S_P1507138990&
> facet.field=I_P1406452433&facet.field=I_P1406453565&
> facet.field=I_P1406452463&facet.field=I_P1406453573&
> facet.field=I_P1406451324&facet.field=I_P1406451288&
> facet.field=S_P1406451282&facet.field=S_P1406452471&
> facet.field=S_P1424886605&facet.field=S_P1946367015&
> facet.field=S_P1424886598&facet.field=S_P1946367018&
> facet.field=S_P1406453556&facet.field=S_P1406389932&
> facet.field=S_P2017662623&facet.field=S_P1406450978&
> facet.field=F_P1406452455&facet.field=S_P1406389972&
> facet.field=S_P1406389974&facet.field=S_P1406389986&
> facet.field=F_P1946367027&facet.field=F_P1406451294&
> facet.field=F_P1406451286&facet.field=F_P1406451328&
> facet.field=S_P1424886593&facet.field=S_P1406453567&
> facet.field=S_P2017662629&facet.field=S_P1406453571&
> facet.field=F_P1946367030&facet.field=S_P1406453569&
> facet.field=S_P2017662626&facet.field=S_P1406389978&
> facet.field=F_P1946367024
>
> My primary question here is, can Solr handle this kind of queries with so
> many facet fields. I have tried using both enum and fc for facet.method and
> there is no improvement with either.
>
> Appreciate any help on this. Thank you.
>
> - Rahul
>
>
> On Mon, Apr 30, 2012 at 2:53 PM, Rahul R <rahul.s...@gmail.com> wrote:
>
> Hello,
>> I am using solr 1.3 with jdk 1.5.0_14 and weblogic 10MP1 application
>> server on Solaris. I use embedded solr server. More details :
>> Number of docs in solr index : 1.4 million
>> Physical size of index : 640MB
>> Total number of fields in the index : 700 (99% of these are dynamic
>> fields)
>> Total number of fields enabled for faceting : 440
>> Avg number of facet fields participating in a faceted query : 50-70
>> Total RAM allocated to weblogic appserver : 3GB (max possible)
>>
>> In a multi user environment with 3 users using this application for a
>> period of around 40 minutes, the application runs out of memory. Analysis
>> of the heap dump shows that almost 85% of the memory is retained by the
>> FieldCache. Now I understand that the field cache is out of our control
>> but would appreciate some suggestions on how to handle this issue.
>>
>> Some questions on this front :
>> - some mail threads on this forum seem to indicate that there could be
>> some connection between having dynamic fields and usage of FieldCache. Is
>> this true ? Most of the fields in my index are dynamic fields.
>> - as mentioned above, most of my faceted queries could have around 50-70
>> facet fields (I would do SolrQuery.addFacetField() for around 50-70 fields
>> per query). Could this be the source of the problem ? Is this too high for
>> solr to support ?
>> - Initially, I had a facet.sort defined in solrconfig.xml. Since
>> FieldCache builds up on sorting, I even removed the facet.sort and tried,
>> but no respite. The behavior is same as before.
>> - The document id that I have for each document is quite big (around 50
>> characters on average). Can this be a problem ? I reduced this to around
>> 15 characters and tried but still there is no improvement.
>> - Can the size of the data be a problem ? But on this forum, I see many
>> users talking of more than 100 million documents in their index. I have
>> only 1.4 million with physical size of 640MB. The physical server on which
>> this application is running, has sufficient RAM and CPU.
>> - What gets stored in the FieldCache ? Is it the entire document or just
>> the document Id ?
>>
>>
>> Any help is much appreciated. Thank you.
>>
>> regards
>> Rahul
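
As a rough sanity check on the sizing rule quoted in this thread (one array of maxdoc entries per faceted field, plus a table of unique terms for string fields), the arithmetic can be sketched in Java. This is only a back-of-the-envelope model, not a measurement of what Solr 1.3 actually allocates. The class name FieldCacheEstimate and its methods are hypothetical, and the three size constants (4 bytes per per-document entry, 2 bytes per character, 40 bytes of overhead per String) are assumptions chosen for illustration.

```java
// Hypothetical helper: a rough model of FieldCache memory, not Solr's real accounting.
class FieldCacheEstimate {
    // Assumed sizes (illustrative guesses, not measured on any JVM).
    static final long BYTES_PER_DOC_ENTRY = 4;     // one int or 32-bit reference per document
    static final long BYTES_PER_CHAR = 2;          // Java chars are 16-bit
    static final long STRING_OVERHEAD_BYTES = 40;  // assumed per-String object overhead

    // One faceted field: a maxDoc-sized array plus a table of the unique terms.
    static long perFieldBytes(long maxDoc, long uniqueTerms, long avgTermLength) {
        long perDocArray = maxDoc * BYTES_PER_DOC_ENTRY;
        long termTable = uniqueTerms * (avgTermLength * BYTES_PER_CHAR + STRING_OVERHEAD_BYTES);
        return perDocArray + termTable;
    }

    // All faceted fields, assuming every one of them has been populated.
    static long totalBytes(long maxDoc, long uniqueTerms, long avgTermLength, int fields) {
        return perFieldBytes(maxDoc, uniqueTerms, avgTermLength) * fields;
    }

    public static void main(String[] args) {
        // Figures from the thread: 1.4 million docs, ~300 unique terms,
        // ~20 characters per term, 440 fields enabled for faceting.
        long perField = perFieldBytes(1400000L, 300L, 20L);
        long total = totalBytes(1400000L, 300L, 20L, 440);
        System.out.println("per field : " + perField + " bytes");
        System.out.println("440 fields: " + total + " bytes");
    }
}
```

Under these assumptions the per-document array dominates: dropping from 300 to 100 unique terms changes the per-field figure only slightly, because the array still has one entry per document. With the numbers from this thread, the model comes to roughly 5.6 MB per field and roughly 2.3 GB once all 440 facet fields have been touched, which would be consistent with a 3 GB heap filling up over time.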