Bharani wrote:
Hi,

I have two sets of documents:

1) Primary Document
2) Occurrences of primary document

Since there is no such thing as a "join", I can either
a) Post the primary document with the occurrences as a multi-valued field
 or
b) Post the primary document once for every occurrence, i.e. the classic
de-normalized route

My problem with
Option a) This works great as long as the occurrence is a single field, but
if I have a group of fields that describes the occurrence, then the search
returns wrong results because of the nature of text search

For example:
<date>1 Jan 2007</date>
<type>review</type>

<date>2 Jan 2007</date>
<type>revision</type>

If I search for 2 Jan 2007 and <date> 1 Jan 2007 </date> I will get a hit
(which is wrong) because there is no grouping of fields to associate date
and type as one unit. If I merge them into one entity, then I can't use
range queries on the date

Option b) This would result in a large number of documents, and even if I
index only and do not store, I still have to deal with duplicate hits,
because all I want is the primary document


Is there a better approach to the problem?

Are you concerned about the size of your index?

One of the difficulties that you're going to find with multi-valued fields is that they are an unordered collection without relation. If you have a document with a list of editors and revisions, the two fields have no inherent correlation unless your application can extract it from the data itself.

[doc]
   [id]123[/id]
   [str name=name]hello world[/str]
   [array name=editor]
       [str name=editor]Fred[/str]
       [str name=editor]Bob[/str]
   [/array]
   [array name=revisiondate]
      [date name=revisiondate]2006-01-01T00:00:00Z[/date]
      [date name=revisiondate]2006-01-02T00:00:00Z[/date]
   [/array]
[/doc]

If your application can decipher that and do a slice on it showing a revision...then brilliant! But if the multi-valued fields are out of order, that might make a significant difference.
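
To illustrate the point about correlation, here is a minimal Python sketch (not Solr code) of the example document held in memory. The pairing of editors to dates by position is purely an assumption for illustration, and the helper names matches_flat and matches_paired are hypothetical. It shows how a field-by-field match can hit on a combination that never occurred together, and how a positional pairing only helps if both lists keep the same order.

```python
from datetime import date

# Hypothetical in-memory copy of the example document; the field names mirror it.
# Assumption for illustration only: editor[i] made the revision at revisiondate[i].
doc = {
    "id": "123",
    "editor": ["Fred", "Bob"],
    "revisiondate": [date(2006, 1, 1), date(2006, 1, 2)],
}

def matches_flat(doc, editor, day):
    """Field-by-field match: each condition may be satisfied by a different value."""
    return editor in doc["editor"] and day in doc["revisiondate"]

def matches_paired(doc, editor, day):
    """Positional match: only meaningful if both lists are kept in the same order."""
    return (editor, day) in zip(doc["editor"], doc["revisiondate"])

# Under the assumed pairing, Fred did not edit on 2006-01-02.
flat_hit = matches_flat(doc, "Fred", date(2006, 1, 2))      # hits anyway
paired_hit = matches_paired(doc, "Fred", date(2006, 1, 2))  # no hit
```

The positional version depends entirely on the application preserving the order of both lists, which is the caveat raised above.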

I would create a document per revision and take advantage of range queries and sorting available at the query level.
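
As a rough sketch of that approach, the Python below (again plain in-memory code, not the Solr API) flattens one primary document into one document per occurrence, matches type and date range on the same document, sorts by date, and collapses the hits back to unique primary ids. The field names primary_id, date and type, and the helpers flatten and search, are hypothetical; the values are taken from the question.

```python
from datetime import date

# Hypothetical primary document with grouped occurrences (values from the question).
primary = {
    "id": "123",
    "name": "hello world",
    "occurrences": [
        {"date": date(2007, 1, 1), "type": "review"},
        {"date": date(2007, 1, 2), "type": "revision"},
    ],
}

def flatten(primary):
    """Emit one index document per occurrence, keyed back to the primary id."""
    return [
        {
            "id": f"{primary['id']}-{i}",
            "primary_id": primary["id"],
            "name": primary["name"],
            "date": occ["date"],
            "type": occ["type"],
        }
        for i, occ in enumerate(primary["occurrences"])
    ]

def search(docs, type_, start, end):
    """Match type and date range on the same document, sort by date,
    and return each primary id only once."""
    hits = sorted(
        (d for d in docs if d["type"] == type_ and start <= d["date"] <= end),
        key=lambda d: d["date"],
    )
    seen, primary_ids = set(), []
    for d in hits:
        if d["primary_id"] not in seen:
            seen.add(d["primary_id"])
            primary_ids.append(d["primary_id"])
    return primary_ids

docs = flatten(primary)
review_on_jan_2 = search(docs, "review", date(2007, 1, 2), date(2007, 1, 2))
review_in_jan = search(docs, "review", date(2007, 1, 1), date(2007, 1, 31))
```

Because date and type live on the same document, a review on 2 Jan 2007 no longer matches, and the duplicate hits are removed by the application when it maps results back to the primary id.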




Jed
