Re: Sort by date field = outofmemory?

Lance Norskog Sat, 14 Jul 2012 17:42:49 -0700

Sorting requires an array of 4-byte ints, one for each document. If
the field is a number or date, this is the only overhead: 80M docs * 4
bytes = 320 MB for each sorted field. If it is something else, like
a string, Lucene also creates an array holding one copy of every unique
value.
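
To make the arithmetic above concrete, here is a rough sketch in Python.
It takes the 4-bytes-per-document figure at face value; the function name
sort_array_bytes and its parameters are made up for illustration and are
not part of Solr or Lucene.

```python
# Back-of-the-envelope estimate of the per-document sort array.
# Assumption: one 4-byte int per document per sorted field, as estimated above.
BYTES_PER_INT = 4


def sort_array_bytes(num_docs: int, sorted_fields: int = 1) -> int:
    """Return the bytes needed for the per-document int arrays."""
    return num_docs * BYTES_PER_INT * sorted_fields


if __name__ == "__main__":
    docs = 80_000_000  # index size mentioned in this thread
    total = sort_array_bytes(docs)
    # Report in decimal megabytes and in binary mebibytes.
    print(f"{total / 1_000_000:.0f} MB, {total / 2**20:.0f} MiB")
```

This ignores the extra array of unique values needed for string fields,
so treat it as a lower bound, not a measurement.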


If your query result sets are small, you can sort on a function. This
does not create these large arrays.

On Thu, Jul 12, 2012 at 8:09 AM, Erick Erickson <erickerick...@gmail.com> wrote:
> Bruno:
>
> You can also reduce your memory requirements by storing fewer unique values.
> All the _unique_ values for a field in the index are read in for sorting.
> People often store timestamps in milliseconds, which essentially means
> that every document has a unique value.
>
> Storing your timestamps in the coarsest granularity that suits your
> use case is always a good idea; see the date math:
> http://lucene.apache.org/solr/api-4_0_0-ALPHA/org/apache/solr/util/DateMathParser.html
>
> Best
> Erick
>
> On Wed, Jul 11, 2012 at 12:44 PM, Yury Kats <yuryk...@yahoo.com> wrote:
>> This solves the problem by allocating memory up front, instead of at some
>> point later when JVM needs it. At that later point in time there may not
>> be enough free memory left on the system to allocate.
>>
>> On 7/11/2012 11:04 AM, Michael Della Bitta wrote:
>>> There is a school of thought that suggests you should always set Xms
>>> and Xmx to the same thing if you expect your heap to hit Xms. This
>>> results in your process only needing to allocate the memory once,
>>> rather than in a series of little allocations as the heap expands.
>>>
>>> I can't explain how this fixed your problem, but just a datapoint that
>>> might suggest that doing what you did is not such a bad thing.
>>>
>>> Michael Della Bitta
>>>
>>> ------------------------------------------------
>>> Appinions, Inc. -- Where Influence Isn’t a Game.
>>> http://www.appinions.com
>>>
>>>
>>> On Wed, Jul 11, 2012 at 4:05 AM, Bruno Mannina <bmann...@free.fr> wrote:
>>>> Hi, some news this morning...
>>>>
>>>> I added the -Xms1024m option and now it works?! No OutOfMemoryError?!
>>>>
>>>> java -jar -Xms1024m -Xmx2048m start.jar
>>>>
>>>> On 11/07/2012 09:55, Bruno Mannina wrote:
>>>>
>>>>> Hi Yury,
>>>>>
>>>>> Thanks for your answer.
>>>>>
>>>>> OK for increasing the memory, but I have a problem with that:
>>>>> I have 8 GB on my computer, but the JVM accepts only 2 GB max with the
>>>>> -Xmx option.
>>>>> Is that normal?
>>>>>
>>>>> Thanks,
>>>>> Bruno
>>>>>
>>>>> On 11/07/2012 03:42, Yury Kats wrote:
>>>>>>
>>>>>> Sorting is a memory-intensive operation indeed.
>>>>>> Not sure what you are asking, but it may very well be that your
>>>>>> only option is to give JVM more memory.
>>>>>>
>>>>>> On 7/10/2012 8:25 AM, Bruno Mannina wrote:
>>>>>>>
>>>>>>> Dear Solr Users,
>>>>>>>
>>>>>>> Each time I try to do a request with &sort=pubdate+desc....
>>>>>>>
>>>>>>> I get:
>>>>>>> GRAVE: java.lang.OutOfMemoryError: Java heap space
>>>>>>>
>>>>>>> I use Solr 3.6, I have around 80M docs, and my request returns around
>>>>>>> 160 results.
>>>>>>>
>>>>>>> Currently, for my test, I use Jetty:
>>>>>>>
>>>>>>> java -jar -Xmx2g start.jar
>>>>>>>
>>>>>>> PS: If I write 3g I get an error; I have 8 GB of RAM.
>>>>>>>
>>>>>>> Thanks a lot for your help,
>>>>>>> Bruno
>>>>>>>



-- 
Lance Norskog
goks...@gmail.com
