On Fri, Oct 1, 2010 at 5:42 PM, Renee Sun <renee_...@mcafee.com> wrote:
> Hi Yonik,
>
> I attached the solrconfig.xml to you in previous post, and we do have
> firstSearch and newSearch hook ups.
>
> I commented them out, all 130 cores loaded up in 1 minute, same as in solr
> 1.3. total memory took about 1GB. Whereas in 1.3, with hook ups, it took
> about 6.5GB for same amount of data.

For others' reference, here is the warming query (it's the same for
newSearcher too):

    <listener event="firstSearcher" class="solr.QuerySenderListener">
      <arr name="queries">
        <lst>
          <str name="q">type:message</str>
          <str name="start">0</str>
          <str name="rows">10</str>
          <str name="sort">message_date desc</str>
        </lst>
      </arr>
    </listener>

The sort field message_date is what will be taking up the memory.

Starting with Lucene 2.9 (which is used in Solr 1.4), searching and
sorting are per-segment. This is generally beneficial, but in this case
I believe it is causing the extra memory usage, because the same date
value that would have been shared across all documents in the fieldcache
is now repeated in each segment it is used in.

One potential fix (that requires you to reindex) is to use the "date"
fieldType as defined in the new 1.4 schema:

    <fieldType name="date" class="solr.TrieDateField" omitNorms="true"
               precisionStep="0" positionIncrementGap="0"/>

This will use 8 bytes per document in your index, rather than 4 bytes
per doc + an array of unique string-date values per index.

Trunk (4.0-dev) is also much more efficient at storing string-based
fields in the FieldCache - but that will only help you if you're
comfortable with using development versions.

-Yonik
http://lucenerevolution.org Lucene/Solr Conference, Boston Oct 7-8
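
A sketch of the reindexing suggestion above: once the "date" fieldType is defined, the sort field in schema.xml would need to reference it. The field name message_date comes from the warming query. The indexed and stored values shown here are assumptions and should match whatever the existing schema already uses for that field.

```xml
<!-- Hypothetical field declaration for the sort field.
     type="date" points at the TrieDateField fieldType quoted above;
     indexed/stored values are assumptions, not taken from the poster's schema. -->
<field name="message_date" type="date" indexed="true" stored="true"/>
```

Changing the type only affects documents indexed after the change, which is why the existing data has to be reindexed for the sort to use the new representation.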