These are the figures I got after indexing four and a half million documents with
both Solr 3.6.1 and 4.1.0 (and optimizing the index at the end).

  $ du -h --max-depth=1
  67G   ./solr410
  80G   ./solr361

The main contributor to the reduced space consumption is (as expected, I guess) the
.fdt file:

  $ ls -lh solr361/*/*/*.fdt
  29G solr361/core-tex68bohyrh23qs192adaq-index361/index/_bab.fdt

  $ ls -lh solr410/*/*/*.fdt
  18G solr410/core-tex68bohyz1teef3xsjdaw-index410/index/_23uy.fdt

How much you save depends, of course, on your individual ratio of stored versus indexed-only fields.
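
To put the numbers above in relative terms, here is a small sketch of the arithmetic. The sizes are the rounded values printed by du -h and ls -lh, so the percentages are approximate; the helper name saving_percent is just made up for this example.

```python
# Sizes in G as reported by du -h / ls -lh above (already rounded).
sizes = {
    "whole index": (80, 67),  # (Solr 3.6.1, Solr 4.1.0)
    ".fdt file": (29, 18),
}

def saving_percent(before, after):
    """Relative size reduction in percent."""
    return (before - after) / before * 100

for name, (before, after) in sizes.items():
    print(f"{name}: {before}G -> {after}G, "
          f"{saving_percent(before, after):.0f}% smaller")

# Share of the total saving that comes from the .fdt file alone.
fdt_share = (29 - 18) / (80 - 67) * 100
print(f".fdt accounts for about {fdt_share:.0f}% of the total saving")
```

That works out to roughly 16% for the whole index and 38% for the .fdt file, with the .fdt file responsible for about 85% of the space saved.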

André

________________________________________
From: Shawn Heisey [[email protected]]
Sent: Thursday, 24 January 2013 16:58
To: [email protected]
Subject: Re: Does solr 4.1 support field compression?

On 1/24/2013 8:42 AM, Ken Prows wrote:
> I didn't see any mention of field compression in the release notes for
> Solr 4.1. Did the ability to automatically compress fields end up
> getting added to this release?

The concept of compressed fields (an option in schema.xml) that existed
in the 1.x versions of Solr (based on Lucene 2.9) was removed in Lucene
3.0.  Because Lucene and Solr development were combined, the Solr
version after 1.4.1 is 3.1.0; there is no 1.5 or 2.x version of Solr.

Solr/Lucene 4.1 compresses all stored field data by default.  I don't
think there's a way to turn it off at the moment, which is causing
performance problems for a small subset of Solr users.  When it comes
out, Solr 4.2 will also have compressed term vectors.
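
As a rough illustration of why compressing stored fields shrinks the index, here is a minimal sketch. It is not Lucene's implementation: as far as I understand, Lucene 4.1 compresses stored fields in chunks of about 16 KB with LZ4, whereas this sketch substitutes zlib from the Python standard library. The sample documents, CHUNK_SIZE and compress_chunks are all invented for the example.

```python
import zlib

CHUNK_SIZE = 16 * 1024  # assumed chunk size; illustration only

def compress_chunks(docs, chunk_size=CHUNK_SIZE):
    """Group serialized documents into chunks of at least chunk_size bytes
    and compress each chunk independently (zlib stands in for LZ4)."""
    chunks, current, current_len = [], [], 0
    for doc in docs:
        current.append(doc)
        current_len += len(doc)
        if current_len >= chunk_size:
            chunks.append(zlib.compress(b"".join(current)))
            current, current_len = [], 0
    if current:  # flush the last, possibly smaller, chunk
        chunks.append(zlib.compress(b"".join(current)))
    return chunks

# Hypothetical stored fields; repetitive text like this compresses well.
docs = [
    f"id={i};title=Document number {i};body=some stored text".encode()
    for i in range(5000)
]

raw = sum(len(d) for d in docs)
chunks = compress_chunks(docs)
packed = sum(len(c) for c in chunks)
print(f"raw: {raw} bytes, compressed: {packed} bytes in {len(chunks)} chunks")

# Decompressing every chunk gives back the original bytes.
assert b"".join(zlib.decompress(c) for c in chunks) == b"".join(docs)
```

In a scheme like this, fetching a single document means decompressing the whole chunk it lives in, so the space saving is paid for with some extra CPU work on retrieval.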

The release note contains this text:

Stored fields are compressed. (See
http://blog.jpountz.net/post/33247161884/efficient-compressed-stored-fields-with-lucene)

It looks like the Solr CHANGES.txt file fails to specifically mention
LUCENE-4226 <https://issues.apache.org/jira/browse/LUCENE-4226> which
implemented compressed stored fields.

Thanks,
Shawn
