Hi Yonik,

I am still using pre-2.9 Lucene (taken from SOLR trunk two months ago).

2048 is a limit on the documents themselves, not on an array of pointers to
documents. And especially with the new "uninverted" SOLR features, plus
non-tokenized stored fields, we need 1Gb just to store 1Mb of a simple field
(field size: 1000 bytes).

Maybe it would have broken... Frankly, I started with 8Gb, then for some
reason I set it to 2Gb (a month ago), I don't remember why... I had hardware
problems and I didn't want to lose the RAM buffer too frequently...
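
(For context, this is the knob I mean; a minimal sketch against the Lucene
2.9 API, not exactly the trunk version I'm running. In Solr it comes from
ramBufferSizeMB in solrconfig.xml, which ends up on IndexWriter. The index
path and analyzer below are just placeholders.)

import java.io.File;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class RamBufferSetting {
    public static void main(String[] args) throws Exception {
        // Placeholder index directory and analyzer.
        FSDirectory dir = FSDirectory.open(new File("/tmp/test-index"));
        IndexWriter writer = new IndexWriter(dir,
                new StandardAnalyzer(Version.LUCENE_29),
                IndexWriter.MaxFieldLength.UNLIMITED);

        // Solr's ramBufferSizeMB maps to this setter; per Yonik, 2.9.1 adds
        // a check that rejects values above 2048, so stay a bit below it.
        writer.setRAMBufferSizeMB(2000.0);

        // ... add documents, then commit/close ...
        writer.close();
    }
}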


But again: why would it break? Because "int" has 2048M different values?!!

This is extremely strange. My understanding is that the "buffer" stores
processed data such as "term -> document_id" values, in _per_field_ arrays(!);
so 2048M would be the _absolute_maximum_ only if your SOLR schema consisted of
a _single_tokenized_field_only_. What about 10 fields? What about plain text
stored with the document, term vectors, "uninverted" values??? What is the
reason for putting such a check in Lucene? Array overflow?
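
Thinking about it more, I suspect the answer to my own question is exactly
that: a Java array index (and, presumably, the int offsets Lucene uses into
its buffer) is a signed 32-bit value, so a single array can address at most
~2048M entries and anything past that wraps negative. A quick sketch of the
arithmetic (plain Java, nothing Lucene-specific):

public class IntOverflowSketch {
    public static void main(String[] args) {
        long twoGb = 2048L * 1024 * 1024;                  // 2,147,483,648 bytes
        System.out.println("2048MB in bytes  : " + twoGb);
        System.out.println("Integer.MAX_VALUE: " + Integer.MAX_VALUE); // 2,147,483,647

        // An int offset that crosses the 2GB boundary wraps to a negative
        // value, so it can no longer address a position inside the buffer.
        int offset = Integer.MAX_VALUE;
        offset++;
        System.out.println("offset after ++  : " + offset); // -2147483648
    }
}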


-Fuad
http://www.linkedin.com/in/liferay



> -----Original Message-----
> From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley
> Sent: October-24-09 12:27 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Too many open files
> 
> On Sat, Oct 24, 2009 at 12:18 PM, Fuad Efendi <f...@efendi.ca> wrote:
> >
> > Mark, I don't understand this; of course it is use case specific, I haven't
> > seen any terrible behaviour with 8Gb
> 
> If you had gone over 2GB of actual buffer *usage*, it would have
> broke...  Guaranteed.
> We've now added a check in Lucene 2.9.1 that will throw an exception
> if you try to go over 2048MB.
> And as the javadoc says, to be on the safe side, you probably
> shouldn't go too near 2048 - perhaps 2000MB is a good practical limit.
> 
> -Yonik
> http://www.lucidimagination.com
