RE: Lucene FieldCache memory requirements

2009-11-03 Thread Fuad Efendi
-Fuad

> -Original Message-
> From: Michael McCandless [mailto:luc...@mikemccandless.com]
> Sent: November-03-09 5:00 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Lucene FieldCache memory requirements
>
> On Mon, Nov 2, 2009 at 9:27 PM, Fuad Efendi wrote:
> > I b

Re: Lucene FieldCache memory requirements

2009-11-03 Thread Michael McCandless
On Mon, Nov 2, 2009 at 9:27 PM, Fuad Efendi wrote:
> I believe this is correct estimate:
>
>> C. [maxdoc] x [4 bytes ~ (int) Lucene Document ID]
>>
>>   same as
>> [String1_Document_Count + ... + String10_Document_Count + ...]
>> x [4 bytes per DocumentID]

That's right. Except: as Mark said, you
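The per-document part of the estimate confirmed above can be worked through as a quick calculation. This is only a back-of-the-envelope sketch: the class and method names are made up for illustration, the index size is an assumed figure, and it counts just the 4-byte int per document, not the unique term strings or any JVM object overhead.

```java
// Hypothetical helper, not part of Lucene: estimates the size of the
// per-document int array discussed above (one 4-byte int per document).
final class OrdArrayEstimate {
    static final int BYTES_PER_INT = 4;

    /** Bytes needed for one int per document, for a single cached field. */
    static long ordArrayBytes(long maxDoc) {
        return maxDoc * BYTES_PER_INT;
    }

    public static void main(String[] args) {
        long maxDoc = 100_000_000L; // assumed index size, for illustration only
        long bytes = ordArrayBytes(maxDoc);
        System.out.printf("maxDoc=%d -> %d bytes (~%.1f MB)%n",
                maxDoc, bytes, bytes / (1024.0 * 1024.0));
    }
}
```

The result scales linearly with maxdoc, independent of how many distinct values the field has.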

RE: Lucene FieldCache memory requirements

2009-11-02 Thread Fuad Efendi
FieldCache uses a WeakHashMap internally... nothing wrong with that, but... no amount of Garbage Collection tuning will help if the allocated RAM is not enough to replace Weak** with Strong**, especially for SOLR faceting... 10%-15% of CPU taken by GC has been reported...

-Fuad

RE: Lucene FieldCache memory requirements

2009-11-02 Thread Fuad Efendi
Even in a simplistic scenario, when it is garbage collected, we still _need_to_be_able_ to allocate enough RAM to FieldCache on demand... linear dependency on document count...

>
> Hi Mark,
>
> Yes, I understand it now; however, how will StringIndexCache size down in a
> production system facetin

RE: Lucene FieldCache memory requirements

2009-11-02 Thread Fuad Efendi
ll it size down in purely Lucene-based heavy-loaded production system? Especially if this cache is used for query optimizations.

> -Original Message-
> From: Mark Miller [mailto:markrmil...@gmail.com]
> Sent: November-02-09 8:53 PM
> To: solr-user@lucene.apache.org
> Subject: R

RE: Lucene FieldCache memory requirements

2009-11-02 Thread Fuad Efendi
o be safe, use this in your basic memory estimates:

[512 MB ~ 1 GB] + [non_tokenized_fields_count] x [maxdoc] x [8 bytes]

-Fuad

> -Original Message-
> From: Fuad Efendi [mailto:f...@efendi.ca]
> Sent: November-02-09 7:37 PM
> To: solr-user@lucene.apache.org
> Subject: RE: Lucene
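The "safe" rule of thumb above is plain arithmetic, so it can be turned into a small calculator. This is a sketch under stated assumptions: the class name, the method name, and the example inputs (10 fields, 100 million documents, a 512 MB base) are illustrative only and do not come from Lucene or Solr.

```java
// Hypothetical calculator for the rule of thumb quoted above:
//   base heap + non_tokenized_fields_count x maxdoc x 8 bytes
final class SafeHeapEstimate {
    static final long BYTES_PER_DOC_FIELD = 8L; // the deliberately generous per-entry figure

    static long estimateBytes(long baseBytes, int nonTokenizedFields, long maxDoc) {
        return baseBytes + nonTokenizedFields * maxDoc * BYTES_PER_DOC_FIELD;
    }

    public static void main(String[] args) {
        long base = 512L * 1024 * 1024; // lower end of the suggested base range
        int fields = 10;                // assumed number of non-tokenized fields
        long maxDoc = 100_000_000L;     // assumed index size
        long bytes = estimateBytes(base, fields, maxDoc);
        System.out.printf("estimated heap: %d bytes (~%.2f GB)%n",
                bytes, bytes / (1024.0 * 1024.0 * 1024.0));
    }
}
```

The 8 bytes per document-field is twice the 4-byte int discussed elsewhere in the thread, which leaves headroom for the temporary allocations made while the cache entry is being built.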

Re: Lucene FieldCache memory requirements

2009-11-02 Thread Mark Miller
static final class StringIndexCache extends Cache {
  StringIndexCache(FieldCache wrapper) {
    super(wrapper);
  }

  @Override
  protected Object createValue(IndexReader reader, Entry entryKey)
      throws IOException {
    String field = StringHelper.intern(entryKey.field);

RE: Lucene FieldCache memory requirements

2009-11-02 Thread Fuad Efendi
To be correct, I analyzed FieldCache a while ago and I believed it never "sizes down"...

/**
 * Expert: The default cache implementation, storing all values in memory.
 * A WeakHashMap is used for storage.
 *
 * Created: May 19, 2004 4:40:36 PM
 *
 * @since lucene 1.4
 */

Will it size down? Onl
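The WeakHashMap behaviour named in the Javadoc above can be illustrated in plain Java. This is a minimal sketch, not Lucene's code: a plain Object stands in for whatever key the real cache uses, and the helper name is hypothetical. An entry stays in the map for as long as its key is strongly reachable; once the key is dropped the entry merely becomes eligible for removal, and when that happens is up to the garbage collector, so no output is claimed for the last line.

```java
import java.util.Map;
import java.util.WeakHashMap;

final class WeakCacheSketch {
    /** Returns the cached array for this key, creating it on first use. */
    static int[] getOrCreate(Map<Object, int[]> cache, Object key, int maxDoc) {
        // The value must not reference the key, or the entry could never be collected.
        return cache.computeIfAbsent(key, k -> new int[maxDoc]);
    }

    public static void main(String[] args) {
        Map<Object, int[]> cache = new WeakHashMap<>();
        Object readerKey = new Object(); // hypothetical stand-in for the cache key

        int[] first = getOrCreate(cache, readerKey, 1_000_000);
        int[] second = getOrCreate(cache, readerKey, 1_000_000);
        System.out.println("same array while key is held: " + (first == second));

        readerKey = null; // drop the only strong reference to the key
        System.gc();      // only a hint; removal timing is not guaranteed
        System.out.println("entries after dropping the key: " + cache.size());
    }
}
```

In other words, the map does not shrink under memory pressure while its keys are still in use, which is why the full allocation has to fit in the heap.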

RE: Lucene FieldCache memory requirements

2009-11-02 Thread Fuad Efendi
PM
> To: solr-user@lucene.apache.org
> Subject: RE: Lucene FieldCache memory requirements
>
> Mark,
>
> I don't understand this:
> > so with a ton of docs and a few uniques, you get a temp boost in the RAM
> > reqs until it sizes it down.
>
> Sizes down???

RE: Lucene FieldCache memory requirements

2009-11-02 Thread Fuad Efendi
Mark,

I don't understand this:
> so with a ton of docs and a few uniques, you get a temp boost in the RAM
> reqs until it sizes it down.

Sizes down??? Why is it called a Cache, then? And how does SOLR use it if it is not a cache?

And this:
> A pointer for each doc.

Why can't we use (int) DocumentID?

RE: Lucene FieldCache memory requirements

2009-11-02 Thread Fuad Efendi
I just did some tests on a completely new index (Slave). A sort by a low-distributed non-tokenized field (such as Country) takes milliseconds, but the first sort (ascending) on a tokenized field with heavy distribution took 30 seconds. The second sort (descending) took milliseconds. Generic query *:*; Fiel
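The slow-first, fast-afterwards pattern reported above is what a lazily populated per-field cache would produce. The following is a simplified sketch of that pattern only, not of Lucene's implementation: the class name, the load counter, and the empty array standing in for the expensive pass over the index are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical per-field cache: the first request for a field pays the
// population cost, later requests (e.g. a descending sort) reuse the result.
final class WarmUpSketch {
    private final Map<String, int[]> perField = new HashMap<>();
    private int loads = 0;

    int[] valuesFor(String field, int maxDoc) {
        return perField.computeIfAbsent(field, f -> {
            loads++;                // stands in for the expensive scan of all terms
            return new int[maxDoc]; // one slot per document
        });
    }

    int loads() {
        return loads;
    }

    public static void main(String[] args) {
        WarmUpSketch cache = new WarmUpSketch();
        cache.valuesFor("country", 1_000); // first sort: populates the entry
        cache.valuesFor("country", 1_000); // second sort: served from the cache
        System.out.println("population passes: " + cache.loads());
    }
}
```

Sort direction does not matter here because both sorts read the same per-document array.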

Re: Lucene FieldCache memory requirements

2009-11-02 Thread Mark Miller
Fuad Efendi wrote:
> Simple field (10 different values: Canada, USA, UK, ...), 64-bit JVM... no
> difference between maxdoc and maxdoc + 1 for such estimate... difference is
> between 0.4Gb and 1.2Gb...
>

I'm not sure I understand - but I didn't mean to imply the +1 on maxdoc meant anything. T
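To make the "one int per document plus a small table of unique values" picture concrete, here is a self-contained illustration. It is not Lucene's code: the class and its two arrays are hypothetical, and the temporary array of maxdoc + 1 slots that is trimmed afterwards is an assumption based on the "maxdoc + 1" and "sizes it down" remarks in this thread.

```java
import java.util.Arrays;
import java.util.TreeSet;

// Hypothetical ordinal index: order[doc] points into lookup[], so a field with
// few unique values costs roughly 4 bytes per document plus the unique strings.
final class OrdIndexSketch {
    final int[] order;     // one int per document
    final String[] lookup; // slot 0 = "no value", then unique values in sorted order

    OrdIndexSketch(String[] valuePerDoc) {
        int maxDoc = valuePerDoc.length;
        // Worst case every document has its own value, so start oversized...
        String[] tmp = new String[maxDoc + 1];
        TreeSet<String> unique = new TreeSet<>();
        for (String v : valuePerDoc) {
            if (v != null) unique.add(v);
        }
        int count = 1; // slot 0 stays null for documents without a value
        for (String v : unique) {
            tmp[count++] = v;
        }
        // ...then trim to the number of values actually seen.
        lookup = Arrays.copyOf(tmp, count);

        order = new int[maxDoc];
        for (int doc = 0; doc < maxDoc; doc++) {
            String v = valuePerDoc[doc];
            order[doc] = (v == null) ? 0 : Arrays.binarySearch(lookup, 1, lookup.length, v);
        }
    }

    public static void main(String[] args) {
        String[] docs = {"Canada", "USA", "UK", "Canada", null, "USA"};
        OrdIndexSketch index = new OrdIndexSketch(docs);
        System.out.println("lookup = " + Arrays.toString(index.lookup));
        System.out.println("order  = " + Arrays.toString(index.order));
    }
}
```

The oversized temporary array is what would account for a short-lived jump in memory use while an entry is being built, even when the field ends up with only a handful of unique values.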

RE: Lucene FieldCache memory requirements

2009-11-02 Thread Fuad Efendi
hope it is (int) Document ID...

> -Original Message-
> From: Mark Miller [mailto:markrmil...@gmail.com]
> Sent: November-02-09 6:52 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Lucene FieldCache memory requirements
>
> It also briefly requires more

Re: Lucene FieldCache memory requirements

2009-11-02 Thread Mark Miller
se, this is exceptionally wasteful.
>>>
> This is probably very common case... I think it should be confirmed by
> Lucene developers too... FieldCache is warmed anyway, even when we don't use
> SOLR...
>
>
> -Fuad
>

RE: Lucene FieldCache memory requirements

2009-11-02 Thread Fuad Efendi
when we don't use SOLR...

-Fuad

> -Original Message-
> From: Michael McCandless [mailto:luc...@mikemccandless.com]
> Sent: November-02-09 6:00 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Lucene FieldCache memory requirements
>
> OK I think someone who knows how S

Re: Lucene FieldCache memory requirements

2009-11-02 Thread Michael McCandless
this field) SOLR query for all documents *:* - in this case it will be fully
> populated...
>
>
>> Subject: Re: Lucene FieldCache memory requirements
>>
>> Which FieldCache API are you using? getStrings? or getStringIndex
>> (which is used, under the hood, if you so

RE: Lucene FieldCache memory requirements

2009-11-02 Thread Fuad Efendi
ect: Re: Lucene FieldCache memory requirements
>
> Which FieldCache API are you using? getStrings? or getStringIndex
> (which is used, under the hood, if you sort by this field).
>
> Mike
>
> On Mon, Nov 2, 2009 at 2:27 PM, Fuad Efendi wrote:
> > Any thoughts regarding

Re: Lucene FieldCache memory requirements

2009-11-02 Thread Michael McCandless
Which FieldCache API are you using? getStrings? or getStringIndex (which is used, under the hood, if you sort by this field).

Mike

On Mon, Nov 2, 2009 at 2:27 PM, Fuad Efendi wrote:
> Any thoughts regarding the subject? I hope FieldCache doesn't use more than
> 6 bytes per document-field insta

RE: Lucene FieldCache memory requirements

2009-11-02 Thread Fuad Efendi
Any thoughts regarding the subject? I hope FieldCache doesn't use more than 6 bytes per document-field instance... I am too lazy to research the Lucene source code; I hope someone can provide an exact answer...

Thanks

> Subject: Lucene FieldCache memory requirements
>
> Hi,
>
>
> Can anyone confirm L