On Fri, Oct 1, 2010 at 5:42 PM, Renee Sun <renee_...@mcafee.com> wrote:
> Hi Yonik,
>
> I attached the solrconfig.xml to you in previous post, and we do have
> firstSearch and newSearch hook ups.
>
> I commented them out, all 130 cores loaded up in 1 minute, same as in solr
> 1.3.  total memory took about 1GB. Whereas in 1.3, with hook ups, it took
> about 6.5GB for same amount of data.

For others' reference, here is the warming query (the same one is
used for newSearcher):

<listener event="firstSearcher" class="solr.QuerySenderListener">
<arr name="queries">
<lst>
<str name="q">type:message</str>
<str name="start">0</str>
<str name="rows">10</str>
<str name="sort">message_date desc</str>
</lst>
</arr>
</listener>

The sort field message_date is what will be taking up the memory.

Starting with Lucene 2.9 (which is used in Solr 1.4), searching and
sorting are done per-segment.
This is generally beneficial, but in this case I believe it is causing
the extra memory usage: a date value that would previously have been
stored once and shared across all documents in the FieldCache is now
repeated in each segment it is used in.

One potential fix (that requires you to reindex) is to use the "date"
fieldType as defined in the new 1.4 schema:
    <fieldType name="date" class="solr.TrieDateField" omitNorms="true"
               precisionStep="0" positionIncrementGap="0"/>

This will use 8 bytes per document in the FieldCache, rather than 4
bytes per doc plus an array of unique string-date values per index.
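
To pick up that type, the sort field in schema.xml would need to
reference it. A minimal sketch, assuming message_date is currently
declared with a string-based date type; only the field name and the
type name come from this thread, and the indexed/stored values are
assumptions that should match your existing declaration:

```xml
<!-- Hypothetical declaration: indexed/stored are assumed values, keep
     whatever your current message_date field uses. -->
<field name="message_date" type="date" indexed="true" stored="true"/>
```

Existing segments keep the old encoding, which is why the reindex
mentioned above is needed after the change.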

Trunk (4.0-dev) is also much more efficient at storing string-based
fields in the FieldCache - but that will only help you if you're
comfortable with using development versions.

-Yonik
http://lucenerevolution.org  Lucene/Solr Conference, Boston Oct 7-8
