The first thing I'd do is optimize, if you haven't already. You may be seeing
space still held by deleted documents.
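An optimize can be triggered through Solr's update handler. A minimal sketch; the URL is an assumption (default localhost:8983, single core), so adjust it for your deployment:

```shell
# Hypothetical Solr URL -- adjust host/port/core to your deployment.
SOLR_UPDATE_URL="http://localhost:8983/solr/update"

# Optimizing merges all segments and expunges deleted documents, so the
# on-disk size afterwards reflects only live data. Either form works:
#   curl "$SOLR_UPDATE_URL?optimize=true"
#   curl "$SOLR_UPDATE_URL" -H 'Content-Type: text/xml' --data-binary '<optimize/>'
echo "optimize request: $SOLR_UPDATE_URL?optimize=true"
```

Re-run your size comparison only after the optimize finishes, otherwise you're comparing partially merged indexes.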

And for this comparison you should pretty much look only at the *.fdt and
*.fdx files; those are where the raw stored data lives, and they're about the
only place changing stored= would change the data on disk.
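To see how much of the index is stored-field data, you can sum just the *.fdt/*.fdx sizes. A minimal sketch: it fabricates a dummy index directory so it runs anywhere; point INDEX_DIR at your real data/index directory instead.

```shell
# Fake index dir for demonstration -- replace with your Solr data/index path.
INDEX_DIR=$(mktemp -d)
head -c 4096 /dev/zero > "$INDEX_DIR/_0.fdt"   # stored-field data
head -c 512  /dev/zero > "$INDEX_DIR/_0.fdx"   # index into the .fdt file

# Total bytes held by stored fields (the part stored="false" shrinks):
STORED_BYTES=$(find "$INDEX_DIR" -name '*.fd[tx]' -exec wc -c {} + \
               | tail -1 | awk '{print $1}')
echo "stored-field bytes: $STORED_BYTES"   # prints 4608 for the dummy files
```

Comparing this number before and after the schema change is more telling than comparing the whole index directory, since the other files (postings, norms, etc.) are unaffected by stored=.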

Best
Erick


On Thu, Dec 13, 2012 at 2:56 AM, Jie Sun <jsun5...@yahoo.com> wrote:

> I cleaned up the Solr schema by changing a small portion of the stored
> fields to stored="false".
>
> For 5000 documents (about 500M total size of original documents), I ran a
> benchmark comparing the Solr index size between the schema before and after
> the cleanup.
>
> The first run showed about a 40% reduction in index size (with the old
> schema the index size is 52M; with the new schema it is 30M).
>
> However, the second time I added another 5000 documents (similar data but
> different documents) to the index. This time, for the total of 10,000
> documents, the index size using the old schema is 57M, but the index size
> using the new schema grows to 54M.
>
> How should I explain what I see? Could it be that the second group of 5000
> documents has very different data sizes in the fields that were changed to
> not stored? Or is it because Solr/Lucene's indexing strategy or
> implementation produces smaller size differences as the number of documents
> grows?
>
> Any input will be appreciated.
> thanks
> Jie
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/how-to-understand-this-benchmark-test-results-compare-index-size-after-schema-change-tp4026674.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>