Re: Taxonomy in SOLR

Damien Fontaine Mon, 24 Jan 2011 05:43:16 -0800


On 24/01/2011 13:10, Em wrote:

Hi Damien,

Ahm, the formula I wrote was not a definitive guide, just some numbers I
combined to visualize the amount of data. It is perhaps not even a complete
formula.

Well, if you can keep your taxonomy indexed-only (indexed but not stored),
indexing two identical documents does not double the disk space used.

So five documents, or 4 million, with the same taxonomy use the same disk space as one?

Lucene - and also Solr - work with an inverted index: this means
every indexed term is mapped to the documents that contain it.
So your index size will depend on the number of unique taxonomy terms and
on the pointers that link the documents to these terms. That's it. Usually the
disk space used for an index is much smaller than the size of the original data.
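
To make the size argument concrete, here is a minimal sketch of a toy inverted index in Python. It is only an illustration: the function names and the sample taxonomy paths are made up for this example, and Lucene's real on-disk format is compressed and far more elaborate. It shows that indexing more documents with the same taxonomy does not add new terms, but does add one pointer (posting) per document and term.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to the sorted list of ids of documents containing it.

    `docs` is a dict of {doc_id: list_of_terms}. Toy model only, not
    Lucene's actual data structure.
    """
    index = defaultdict(set)
    for doc_id, terms in docs.items():
        for term in terms:
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

def index_stats(index):
    """Return (number of unique terms, total number of postings)."""
    unique_terms = len(index)
    postings = sum(len(ids) for ids in index.values())
    return unique_terms, postings

# Hypothetical taxonomy terms, chosen only for illustration.
taxonomy = ["animals", "animals/mammals", "animals/mammals/cats"]

one_doc = {1: taxonomy}
five_docs = {doc_id: taxonomy for doc_id in range(1, 6)}

print(index_stats(build_inverted_index(one_doc)))    # (3, 3)
print(index_stats(build_inverted_index(five_docs)))  # (3, 15)
```

In this toy model, five documents with the same taxonomy are not quite equal to one: the term dictionary stays at three entries, while the list of document pointers grows with each document. Those pointers are typically small compared with the original text.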

I hope what I tried to explain was easy to understand.

Thanks, it's very helpful!

Where can I find more explanation of the internal structure of the Lucene indexer?


Damien
