On Thu, Dec 2, 2010 at 4:31 PM, Burton-West, Tom wrote:
> We turned on infostream. Is there documentation about how to interpret it,
> or should I just grep through the codebase?
There isn't any documentation... and it changes over time as we add
new diagnostics.
> Is the excerpt below what
On Wed, Dec 1, 2010 at 3:01 PM, Shawn Heisey wrote:
> I have seen this. In Solr 1.4.1, the .fdt, .fdx, and the .tv* files do not
> segment, but all the other files do. I can't remember whether it behaves
> the same under 3.1, or whether it also creates these files in each segment.
Yep, that's t
-user@lucene.apache.org
Subject: Re: ramBufferSizeMB not reflected in segment sizes in index
On Wed, Dec 1, 2010 at 3:16 PM, Burton-West, Tom wrote:
> Thanks Mike,
>
> Yes we have many unique terms due to dirty OCR and 400 languages and probably
> lots of low doc freq terms as well (although with the ICUTokenizer and
> ICUFoldingFilter we should get fewer terms due to bad tokenization a
n the production indexer. If it doesn't I'll turn it on and post
here.
Tom
-----Original Message-----
From: Michael McCandless [mailto:luc...@mikemccandless.com]
Sent: Wednesday, December 01, 2010 2:43 PM
To: solr-user@lucene.apache.org
Subject: Re: ramBufferSizeMB not reflected in s
On 12/1/2010 12:13 PM, Burton-West, Tom wrote:
> We have set the ramBufferSizeMB to 320 in both the indexDefaults and the
> mainIndex sections of our solrconfig.xml:
> <ramBufferSizeMB>320</ramBufferSizeMB>
> 20
> We expected that this would mean that the index would not write to disk until
> it reached somewhere approximately over 300MB
The ram efficiency (= size of segment once flushed divided by size of
RAM buffer) can vary drastically.
Because the in-RAM data structures must be "growable" (to append new
docs to the postings as they are encountered), the efficiency is never
100%. I think 50% is actually a "good" ram efficiency
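
To make the ratio above concrete, here is a rough sketch of the arithmetic only. The function names are made up for illustration and are not part of Lucene or Solr, the 50% figure is just the estimate quoted above, and the 80MB segment is a hypothetical measurement rather than a number from this thread.

```python
def ram_efficiency(segment_mb: float, ram_buffer_mb: float) -> float:
    """Flushed segment size divided by RAM buffer size (the ratio defined above)."""
    if ram_buffer_mb <= 0:
        raise ValueError("ram_buffer_mb must be positive")
    return segment_mb / ram_buffer_mb


def expected_segment_mb(ram_buffer_mb: float, efficiency: float) -> float:
    """Approximate on-disk size of a flushed segment for an assumed efficiency."""
    return ram_buffer_mb * efficiency


if __name__ == "__main__":
    buffer_mb = 320.0  # ramBufferSizeMB from the original post

    # Assumption: the ~50% "good" efficiency mentioned above.
    print(expected_segment_mb(buffer_mb, 0.5))  # 160.0

    # Hypothetical measurement: an 80MB flushed segment.
    print(ram_efficiency(80.0, buffer_mb))  # 0.25
```

So with a 320MB buffer, flushed segments of roughly 160MB or smaller would be consistent with that estimate, rather than segments of 300MB and up.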
We are using a recent Solr 3.x (See below for exact version).
We have set the ramBufferSizeMB to 320 in both the indexDefaults and the
mainIndex sections of our solrconfig.xml:
<ramBufferSizeMB>320</ramBufferSizeMB>
20
We expected that this would mean that the index would not write to disk until
it reached somewhere approximate