Re: [Solr-3.4] Norms file size is large in case of many unique indexed fields in index

Robert Muir Thu, 10 Nov 2011 12:12:40 -0800

What is the point of a unique indexed field?

If each of your fields can only ever match one document, you don't
need length normalization, scoring, or a search engine at all...
just use a HashMap?
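
For what it's worth, the sizes quoted below look consistent with the
norms simply being a dense array. The sketch below is a rough estimate
only: it assumes that Lucene 3.x keeps one norm byte per document for
every field that has norms enabled, even for documents that do not
contain that field, and that the reported "MB" figures are 1024*1024
bytes. The helper name estimated_norms_mib is made up for illustration.

```python
# Assumption: one norm byte is stored per (document, field-with-norms) pair,
# regardless of whether the document actually contains that field.
MIB = 1024 * 1024


def estimated_norms_mib(num_docs, fields_with_norms):
    """Estimated norms size in MiB under the one-byte-per-pair assumption."""
    return num_docs * fields_with_norms / MIB


# Figures from the quoted message:
# (documents, unique fields per document, size with norms, size without norms)
reported = [
    (5000, 2, 48.24, 0.56),
    (5000, 3, 72.23, 0.70),
    (10000, 3, 287.54, 1.44),
]

for docs, per_doc, with_norms, without_norms in reported:
    fields = docs * per_doc  # every document adds its own distinct field names
    estimate = estimated_norms_mib(docs, fields)
    measured = with_norms - without_norms  # size attributable to norms
    print(f"{docs} docs, {fields} fields: "
          f"estimated {estimate:.2f}, measured {measured:.2f}")
```

If that assumption holds, the norms grow with (documents * distinct
field names), which is why doubling the document count roughly
quadruples the index size in this setup.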


On Thu, Nov 10, 2011 at 7:42 AM, Ivan Hrytsyuk
<ihryts...@softserveinc.com> wrote:
> Hello everyone,
>
> Our index becomes very large when norms are enabled.
>
> schema.xml:
>
> type declaration:
> <fieldType name="simpleTokenizer" class="solr.TextField"
> positionIncrementGap="100" omitNorms="false">
>     <analyzer>
>         <tokenizer class="solr.KeywordTokenizerFactory" />
>     </analyzer>
> </fieldType>
>
> field declarations:
> <field name="id" stored="true" indexed="true" required="true"
> type="string" />
> <field name="name" stored="true" indexed="true" type="string" />
> <dynamicField name="unique_*" stored="false" indexed="true"
> type="simpleTokenizer" multiValued="false" />
>
> For 5000 documents (every document has 2 unique fields, 2*5000=10000
> unique fields in the index), the index size is 48.24 MB.
> But if we omit norms (omitNorms="true"), the index size is 0.56
> MB.
>
> Next, if we increase the number of unique fields per document to 3
> (3*5000=15000 unique fields in the index), we get 72.23 MB and 0.70 MB
> respectively.
> And if we increase the number of documents to 10000 (3*10000=30000
> unique fields in the index), we get 287.54 MB and 1.44 MB respectively.
>
> We've prepared a test application that reproduces this behavior. It can
> be downloaded here:
> https://bitbucket.org/coldserenity/solr-large-index-with-norms
>
> Could anyone confirm whether this index size is expected in these
> cases? And if it is, what configuration can be applied to reduce the
> size of the index?
>
> Thank you in advance, Ivan
>



-- 
lucidimagination.com
