static final class StringIndexCache extends Cache {
  StringIndexCache(FieldCache wrapper) {
    super(wrapper);
  }

  @Override
  protected Object createValue(IndexReader reader, Entry entryKey)
      throws IOException {
    String field = StringHelper.intern(entryKey.field);
    final int[] retArray = new int[reader.maxDoc()];
    String[] mterms = new String[reader.maxDoc()+1];
    TermDocs termDocs = reader.termDocs();
    TermEnum termEnum = reader.terms (new Term (field));
    int t = 0;  // current term number

    // an entry for documents that have no terms in this field
    // should a document with no terms be at top or bottom?
    // this puts them at the top - if it is changed, FieldDocSortedHitQueue
    // needs to change as well.
    mterms[t++] = null;

    try {
      do {
        Term term = termEnum.term();
        if (term==null || term.field() != field) break;

        // store term text
        // we expect that there is at most one term per document
        if (t >= mterms.length) throw new RuntimeException ("there are more terms than " +
                "documents in field \"" + field + "\", but it's impossible to sort on " +
                "tokenized fields");
        mterms[t] = term.text();

        termDocs.seek (termEnum);
        while (termDocs.next()) {
          retArray[termDocs.doc()] = t;
        }

        t++;
      } while (termEnum.next());
    } finally {
      termDocs.close();
      termEnum.close();
    }

    if (t == 0) {
      // if there are no terms, make the term array
      // have a single null entry
      mterms = new String[1];
    } else if (t < mterms.length) {
      // if there are less terms than documents,
      // trim off the dead array space
      String[] terms = new String[t];
      System.arraycopy (mterms, 0, terms, 0, t);
      mterms = terms;
    }

    StringIndex value = new StringIndex (retArray, mterms);
    return value;
  }
};

The formula for a String Index fieldcache is essentially the String
array of unique terms (which does indeed "size down" at the bottom) and
the int array indexing into the String array.

Fuad Efendi wrote:
> To be correct, I analyzed FieldCache awhile ago and I believed it never
> "sizes down"...
>
> /**
>  * Expert: The default cache implementation, storing all values in memory.
>  * A WeakHashMap is used for storage.
>  *
>  * <p>Created: May 19, 2004 4:40:36 PM
>  *
>  * @since lucene 1.4
>  */
>
>
> Will it size down? Only if we are not faceting (as in SOLR v.1.3)...
>
> And I am still unsure, Document ID vs. Object Pointer.
>
>
>> I don't understand this:
>>
>>> so with a ton of docs and a few uniques, you get a temp boost in the RAM
>>> reqs until it sizes it down.
>>>
>> Sizes down??? Why is it called Cache indeed? And how SOLR uses it if it is
>> not cache?
>>
>

-- 
- Mark

http://www.lucidimagination.com
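
P.S. The two-array layout described above (a String array of unique terms plus an int array indexing into it) can be sketched without Lucene. This is only an illustration under stated assumptions: the class name StringIndexSketch and its fields order and lookup are made up for the example, the input is a plain array holding at most one value per document, and the nested loop stands in for the TermEnum/TermDocs walk in the real code.

```java
import java.util.Arrays;
import java.util.TreeSet;

// Hypothetical, Lucene-free sketch of the StringIndex layout: one int per
// document (order) plus one String per unique term (lookup).
final class StringIndexSketch {
  final int[] order;     // order[doc] = slot in lookup; 0 means "no term"
  final String[] lookup; // lookup[0] is null, then unique terms in sorted order

  StringIndexSketch(String[] docValues) {
    int maxDoc = docValues.length;
    order = new int[maxDoc];
    // Worst case is one unique term per document, plus the null slot.
    String[] terms = new String[maxDoc + 1];
    int t = 1; // slot 0 stays null for documents without a term

    // Collect the unique terms in sorted order, as a term enumeration would.
    TreeSet<String> unique = new TreeSet<>();
    for (String v : docValues) {
      if (v != null) unique.add(v);
    }

    for (String term : unique) {
      terms[t] = term;
      // Stand-in for termDocs: mark every document that holds this term.
      for (int doc = 0; doc < maxDoc; doc++) {
        if (term.equals(docValues[doc])) order[doc] = t;
      }
      t++;
    }

    // The "size down" step: keep only the slots that were actually filled.
    lookup = Arrays.copyOf(terms, t);
  }

  String valueOf(int doc) {
    return lookup[order[doc]];
  }

  public static void main(String[] args) {
    String[] docs = {"red", "blue", null, "red", "blue", "red"};
    StringIndexSketch idx = new StringIndexSketch(docs);
    System.out.println(Arrays.toString(idx.order));  // [2, 1, 0, 2, 1, 2]
    System.out.println(Arrays.toString(idx.lookup)); // [null, blue, red]
  }
}
```

With six documents and two unique values, the temporary term array has seven slots while it is being filled, and the array that is kept has three. The int array always stays at one entry per document, so it is only the String side that shrinks.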