These are the figures I got after indexing four and a half million documents with both Solr 3.6.1 and 4.1.0 (and optimizing the index at the end).

$ du -h --max-depth=1
67G     ./solr410
80G     ./solr361

The main contributor to the reduced space consumption is (as expected, I guess) the .fdt file:

$ ls -lh solr361/*/*/*.fdt
29G solr361/core-tex68bohyrh23qs192adaq-index361/index/_bab.fdt
$ ls -lh solr410/*/*/*.fdt
18G solr410/core-tex68bohyz1teef3xsjdaw-index410/index/_23uy.fdt

This depends, of course, on your individual ratio of stored versus indexed-only fields.

André

________________________________________
From: Shawn Heisey [[email protected]]
Sent: Thursday, January 24, 2013 16:58
To: [email protected]
Subject: Re: Does solr 4.1 support field compression?

On 1/24/2013 8:42 AM, Ken Prows wrote:
> I didn't see any mention of field compression in the release notes for
> Solr 4.1. Did the ability to automatically compress fields end up
> getting added to this release?

The concept of compressed fields (an option in schema.xml) that existed in the 1.x versions of Solr (based on Lucene 2.9) was removed in Lucene 3.0. Because Lucene and Solr development were combined, the Solr version after 1.4.1 is 3.1.0; there is no 1.5 or 2.x version of Solr.

Solr/Lucene 4.1 compresses all stored field data by default. I don't think there's a way to turn it off at the moment, which is causing performance problems for a small subset of Solr users. When it comes out, Solr 4.2 will also have compressed term vectors.

The release note contains this text:

Stored fields are compressed. (See http://blog.jpountz.net/post/33247161884/efficient-compressed-stored-fields-with-lucene)

It looks like the Solr CHANGES.txt file fails to specifically mention LUCENE-4226 <https://issues.apache.org/jira/browse/LUCENE-4226>, which implemented compressed stored fields.

Thanks,
Shawn
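
As a rough check on the du and ls figures quoted in this message, the short Python sketch below works out how much of the overall saving comes from the .fdt file. The numbers are copied from the output above and were already rounded by the -h flag, so the percentages are only approximate. The names `sizes` and `reduction` are made up for this illustration.

```python
# Sizes in GB as printed by `du -h` and `ls -lh` in this message.
# They are rounded by -h, so everything derived from them is approximate.
sizes = {
    "total": {"solr361": 80, "solr410": 67},
    "fdt": {"solr361": 29, "solr410": 18},
}


def reduction(old, new):
    """Return (absolute saving, saving as a percentage of the old size)."""
    saved = old - new
    return saved, 100.0 * saved / old


total_saved, total_pct = reduction(sizes["total"]["solr361"], sizes["total"]["solr410"])
fdt_saved, fdt_pct = reduction(sizes["fdt"]["solr361"], sizes["fdt"]["solr410"])

# Share of the overall saving that is explained by the stored-fields file.
fdt_share = 100.0 * fdt_saved / total_saved

print(f"whole index: {total_saved}G saved ({total_pct:.1f}%)")
print(f".fdt file:   {fdt_saved}G saved ({fdt_pct:.1f}%)")
print(f".fdt accounts for about {fdt_share:.0f}% of the total saving")
```

With these inputs the .fdt file shrinks by roughly 38%, and that accounts for most of the 13G difference between the two index directories. An index with fewer stored fields would presumably see a smaller effect.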
