Thank you guys for the responses.

Some background on the task:
The problem we are trying to solve with Solr is the following.
We have to provide full-text search over documents that consist partly of 
fields that are always present and partly of additional metadata stored as 
key-value pairs whose keys are not known beforehand. We still need to be able 
to search the content of that additional metadata.

Because we have to provide FTS abilities, we used Solr and not a HashMap or 
some BigTable.
To address the "optionality" of the additional metadata fields and their 
searchability, we decided to use Solr indexed dynamic fields.

Questions:
1. Yonik, will your approach work for us with the following data:
doc1
  uniqueFields:["100=boo foo roo","101=bar bar 100 boo"]
doc2
  uniqueFields:["101=boo roo","102=bar foo 101 boo"]
and we want to fetch documents that contain the value 'foo' in the metadata 
entry with key 100? (That is, only doc1 should be returned.)

2. Should I file a JIRA issue about the large index size, or is this expected 
behaviour in our case?
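
To make question 1 concrete, here is a minimal Python sketch of the matching 
semantics we are after. It is not Solr code and says nothing about how Solr 
would evaluate such a query; the `docs` dictionary and the `matches` helper are 
hypothetical names used only for illustration, and whitespace tokenization of 
the value is an assumption.

```python
# Hypothetical in-memory model of the two documents from question 1.
docs = {
    "doc1": ["100=boo foo roo", "101=bar bar 100 boo"],
    "doc2": ["101=boo roo", "102=bar foo 101 boo"],
}


def matches(unique_fields, key, term):
    """Return True if one entry has exactly this key AND its value
    contains the term as a whole whitespace-separated token."""
    for entry in unique_fields:
        # Split "key=value" at the first '=' only.
        entry_key, _, value = entry.partition("=")
        if entry_key == key and term in value.split():
            return True
    return False


# Documents whose metadata entry with key 100 contains 'foo'.
hits = [name for name, fields in docs.items() if matches(fields, "100", "foo")]
print(hits)  # ['doc1']
```

The important part is that the key and the term must match inside the same 
entry: with this helper, key 101 and term 'foo' match neither document, even 
though doc2 contains both tokens "101" and "foo" in its "102=..." entry.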

Thanks, Ivan
 


________________________________________
From: ysee...@gmail.com [ysee...@gmail.com] On Behalf Of Yonik Seeley 
[yo...@lucidimagination.com]
Sent: Thursday, November 10, 2011 10:22 PM
To: solr-user@lucene.apache.org
Subject: Re: [Solr-3.4] Norms file size is large in case of many unique indexed 
fields in index

On Thu, Nov 10, 2011 at 7:42 AM, Ivan Hrytsyuk
<ihryts...@softserveinc.com> wrote:
> For 5000 documents (every document has 2 unique fields, 2*5000=10000
> unique fields in index), index size is 48.24 MB.

You might be able to turn this around and encode the "unique field"
information in a multi-valued field:

For example, instead of
  myUniqueField100:"foo"  myUniqueField101:"bar"
you could do
  uniqueFields:["100=foo","101=bar"]

The exact details depend on how you are going to use/query these
fields of course.

-Yonik
http://www.lucidimagination.com
