Hello,

I am trying to understand how I can size the caches for my solr powered application. Some details on the index and application :

Solr Version : 1.3
JDK : 1.5.0_14 32 bit
OS : Solaris 10
App Server : Weblogic 10 MP1
Number of documents : 1 million
Total number of fields : 1000 (750 strings, 225 int/float/double/long, 25 boolean)
Number of fields on which faceting and filtering can be done : 400
Physical size of index : 600MB
Number of unique values for a field : Ranges from 5 - 1000. Average of 150
-Xms and -Xmx vals for jvm : 3G
Expected number of concurrent users : 15
No sorting planned for now

Now I want to set appropriate values for the caches. Below I have put my understanding of each cache, along with my questions. Please correct and answer accordingly.

filterCache:
As per the Solr wiki, this is used to store an unordered list of ids of the documents matching an fq param. So if a query contains two fq params, it will create two separate entries, one for each fq param. The value of each entry is the list of ids of all documents across the index that match the corresponding fq param. Each entry is independent of any other entry.
A minimum size for the filterCache could be (total number of fields * avg number of unique values per field). Is this correct? I have not enabled <useFilterForSortedQuery>.
The maximum physical size of the filterCache would be (size * avg byte size of a document id * avg number of docs returned per fq param)?

queryResultCache:
Used to store an ordered list of ids of the documents that match the most commonly used searches. So if my query is something like q=Status:Active&fq=Org:Apache&fq=Version:13, it will create one entry that contains the list of ids of the documents that match this full query. Is this correct? How can I size my queryResultCache?
Some entries from solrconfig.xml :
<queryResultWindowSize>50</queryResultWindowSize>
<queryResultMaxDocsCached>200</queryResultMaxDocsCached>
The maximum physical size of the queryResultCache would be (size * avg byte size of a document id * avg number of docs per query). Is this correct?

documentCache:
Stores the documents that are stored in the index. So if I do two searches that return three documents each, with one document common to both result sets, this will result in 5 entries in the documentCache for the 5 unique documents that have been returned for the two queries. Is this correct?
For sizing, the Solr wiki states that "The size for the documentCache should always be greater than <max_results> * <max_concurrent_queries>". Why do we need the max_concurrent_queries parameter here?
Is it when max_results is much less than numDocs? In my case, a q=*:* search is done the first time the index is loaded. So, will setting the documentCache size to numDocs be correct? Can this be treated as the maximum that I need to allocate?
The maximum physical size of the documentCache would be (size * avg byte size of a document in the index). Is this correct?

Thank you
-Rahul
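
P.S. To make the three size formulas above concrete, here is a rough Python sketch of the arithmetic as I currently understand it. It is only my guess at the calculation, not a description of how Solr actually accounts for memory. The 4-byte document id, the 512-entry queryResultCache size, the even spread of documents across field values, and the per-document byte estimate derived from the index size are all assumptions of mine, and the function names are made up for illustration. I also use the 400 filterable fields rather than all 1000 fields for the filterCache entry count.

```python
# Back-of-the-envelope cache sizing, following the formulas in my questions.
# Values marked "assumed" are placeholders, not measurements.

NUM_DOCS = 1_000_000                 # documents in my index
FILTER_FIELDS = 400                  # fields used for faceting/filtering
AVG_UNIQUE_VALUES = 150              # average unique values per field
INDEX_BYTES = 600 * 1024 * 1024      # physical index size (600MB)
DOC_ID_BYTES = 4                     # assumed: one 32-bit int per document id
QUERY_CACHE_ENTRIES = 512            # assumed queryResultCache size
MAX_DOCS_CACHED = 200                # queryResultMaxDocsCached from solrconfig.xml


def filter_cache_entries(fields, avg_unique_values):
    """One entry per distinct field:value filter."""
    return fields * avg_unique_values


def id_list_cache_bytes(entries, avg_ids_per_entry, id_bytes=DOC_ID_BYTES):
    """size * avg byte size of a document id * avg number of ids per entry."""
    return entries * avg_ids_per_entry * id_bytes


def document_cache_bytes(entries, avg_doc_bytes):
    """size * avg byte size of a document."""
    return entries * avg_doc_bytes


def to_mb(num_bytes):
    return num_bytes / (1024 * 1024)


if __name__ == "__main__":
    entries = filter_cache_entries(FILTER_FIELDS, AVG_UNIQUE_VALUES)
    # Assumed: documents are spread evenly over the values of a field.
    avg_docs_per_fq = NUM_DOCS // AVG_UNIQUE_VALUES
    filter_bytes = id_list_cache_bytes(entries, avg_docs_per_fq)

    # Upper bound per entry taken from queryResultMaxDocsCached.
    query_bytes = id_list_cache_bytes(QUERY_CACHE_ENTRIES, MAX_DOCS_CACHED)

    # Crude per-document estimate: index size divided by document count.
    avg_doc_bytes = INDEX_BYTES // NUM_DOCS
    doc_bytes = document_cache_bytes(NUM_DOCS, avg_doc_bytes)

    print("filterCache entries:", entries)
    print("filterCache MB: %.1f" % to_mb(filter_bytes))
    print("queryResultCache MB: %.2f" % to_mb(query_bytes))
    print("documentCache MB: %.1f" % to_mb(doc_bytes))
```

If these formulas are right, the filterCache estimate alone comes to roughly half of my 3G heap, which is why I would like to confirm them before choosing values.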