It does end up in the right order (sorted), but it's very expensive.  Sorting 
by a couple fields that each have fewer unique index values seems to limit the 
memory consumption greatly.
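
A rough way to see the effect described in the quoted thread below is to count how many distinct sort terms each layout produces. This is only a sketch in Python, not a measurement of Lucene itself: the function names and the randomly generated sample data are made up for illustration, and it assumes (per Erick's note) that sort-cache cost grows with the number of unique terms in the field.

```python
from datetime import datetime, timedelta
import random


def unique_term_counts(timestamps):
    """Count distinct sort terms for the full string vs. a date/time split."""
    # Full form, e.g. '2008-08-12T12:18:26.510' (millisecond precision).
    full = {
        ts.strftime("%Y-%m-%dT%H:%M:%S.") + f"{ts.microsecond // 1000:03d}"
        for ts in timestamps
    }
    # Split form, e.g. '2008-08-12' and '121826'.
    dates = {ts.strftime("%Y-%m-%d") for ts in timestamps}
    times = {ts.strftime("%H%M%S") for ts in timestamps}
    return {"full": len(full), "date": len(dates), "time": len(times)}


def sample_timestamps(n, seed=0):
    """Hypothetical data: n random instants within 365 days of 2008-01-01."""
    rng = random.Random(seed)
    start = datetime(2008, 1, 1)
    span_ms = 365 * 24 * 60 * 60 * 1000
    return [start + timedelta(milliseconds=rng.randrange(span_ms)) for _ in range(n)]


if __name__ == "__main__":
    # Nearly every full timestamp is its own term; the split fields are
    # capped at 365 dates and 86,400 second-of-day values for this sample.
    print(unique_term_counts(sample_timestamps(100_000)))
```

Note that the split drops the milliseconds, so documents within the same second would tie unless a third field is added.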

-----Original Message-----
From: Walter Underwood [mailto:wunderw...@netflix.com]
Sent: Tuesday, April 07, 2009 11:12 AM
To: solr-user@lucene.apache.org
Subject: Re: Coming up with a model of memory usage

Why tokenize the date? It sorts just fine as a string. --wunder

On 4/7/09 8:50 AM, "Erick Erickson" <erickerick...@gmail.com> wrote:

> Your observations about date sorting are probably correct. The
> issue is that the sort caches in Lucene look at the unique terms.
> There are many more unique terms (nearly every one) in
> 2008-08-12T12:18:26.510
>
> than when the field is split. You can reduce memory consumption
> when sorting even more by splitting into more fields, but that's up
> to you to decide whether or not it's worth the effort....
>
> Best
> Erick
>
> On Tue, Apr 7, 2009 at 10:55 AM, Joe Pollard
> <joe.poll...@bazaarvoice.com> wrote:
>
>> It doesn't seem to matter whether fields are stored or not, but I've
>> found a rather striking difference in the memory requirements during
>> sorting.  Sorting on a string field representing datetime like
>> '2008-08-12T12:18:26.510' is about twice as memory-intensive as sorting
>> first by '2008-08-12' and then by '121826'.
>>
>> Any other tips/guidance like this would be great!
>>
>> Thanks,
>> -Joe
>>
>> On Mon, 2009-04-06 at 15:43 -0500, Joe Pollard wrote:
>>> To combat our frequent OutOfMemory Exceptions, I'm attempting to come up
>>> with a model so that we can determine how much memory to give Solr based
>>> on how much data we have (as we expand to more data types eligible to be
>>> supported this becomes more important).
>>>
>>> Are there any published guidelines on how much memory a particular
>>> document takes up in memory, based on the data types, etc?
>>>
>>> I have several stored fields, numerous other non-stored fields, a
>>> largish copyTo field, and I am doing some sorting on indexed, non-stored
>>> fields.
>>>
>>> Any pointers would be appreciated!
>>>
>>> Thanks,
>>> -Joe
>>>
>>
>>
