The first thing I'd do is optimize, if you haven't already. You may be seeing data left over from deleted documents.
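On Solr installs of that era, an optimize can be issued as an XML update message. A minimal sketch, assuming the stock example defaults (adjust host, port, and core path for your install):

```shell
#!/bin/sh
# Force-merge the index so data from deleted documents is purged before
# comparing index sizes. The URL is an assumption, not a universal default.
optimize_core() {
  curl -s "$1" -H 'Content-Type: text/xml' --data-binary '<optimize/>' \
    || echo "optimize request failed (is Solr running at $1?)"
}

optimize_core "http://localhost:8983/solr/update"
```

A commit with an optimize merges segments down, so the on-disk size you measure afterwards no longer includes deleted documents' data.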
And you can pretty much ignore the *.fdt and *.fdx files; those are where the raw stored data lives, and they're about the only place where changing stored= would change the data on disk.

Best
Erick

On Thu, Dec 13, 2012 at 2:56 AM, Jie Sun <jsun5...@yahoo.com> wrote:
> I cleaned up the Solr schema by changing a small portion of the stored
> fields to stored="false".
>
> For 5000 documents (about 500M total size of original documents), I ran a
> benchmark comparing the Solr index size between the schemas before and
> after the cleanup.
>
> The first run showed about a 40% reduction in index size (52M with the
> old schema vs. 30M with the new schema).
>
> However, I then added another 5000 documents (similar data but different
> documents) to the index. This time, for the total of 10,000 documents,
> the index size with the old schema is 57M, but the index size with the
> new schema grows to 54M.
>
> How should I explain what I see? Could it be that the second group of
> 5000 documents has very different data sizes in the fields that were
> changed to not be stored? Or does Solr/Lucene's indexing strategy or
> implementation produce smaller differences in index size as the number
> of documents grows?
>
> Any input will be appreciated.
> Thanks,
> Jie
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/how-to-understand-this-benchmark-test-results-compare-index-size-after-schema-change-tp4026674.html
> Sent from the Solr - User mailing list archive at Nabble.com.
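Since stored= only affects the *.fdt/*.fdx files, one way to sanity-check a benchmark like this is to break the index directory's size down by file extension and compare just the stored-field portion between the two schemas. A rough sketch (the INDEX_DIR variable is hypothetical; point it at your core's data/index directory):

```shell
#!/bin/sh
# Sum a Lucene index directory's disk usage by file extension, so the
# stored-field files (*.fdt, *.fdx) can be compared separately from the
# rest of the index. INDEX_DIR is an assumption -- set it to data/index.
index_usage_by_ext() {
  find "$1" -maxdepth 1 -type f | while read -r f; do
    printf '%s %s\n' "${f##*.}" "$(wc -c < "$f")"
  done | awk '{sum[$1] += $2} END {for (e in sum) print e, sum[e]}'
}

index_usage_by_ext "${INDEX_DIR:-.}"
```

Running this against the old-schema and new-schema indexes should show the difference concentrated in the fdt/fdx lines, with the other extensions (postings, term dictionary, etc.) roughly unchanged.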