Thanks

On 6 June 2011 19:32, Erick Erickson <erickerick...@gmail.com> wrote:
> #Everybody# (including me) who has any RDBMS background
> doesn't want to flatten data, but that's usually the way to go in
> Solr.
>
> Part of whether it's a good idea or not depends on how big the index
> gets, and unfortunately the only way to figure that out is to test.
>
> But that's the first approach I'd try.
>
> Good luck!
> Erick
>
> On Mon, Jun 6, 2011 at 11:42 AM, Judioo <cont...@judioo.com> wrote:
> > On 5 June 2011 14:42, Erick Erickson <erickerick...@gmail.com> wrote:
> >
> >> See: http://wiki.apache.org/solr/SchemaXml
> >>
> >> By adding multiValued="true" to the field, you can add
> >> the same field multiple times in a doc, something like
> >>
> >> <add>
> >>   <doc>
> >>     <field name="mv">value1</field>
> >>     <field name="mv">value2</field>
> >>   </doc>
> >> </add>
> >>
> > I can't see how that would work, as one would need to associate the right
> > start / end dates and price.
> > As I understand it, using multiValued and thus flattening the discounts
> > would result in:
> >
> > {
> >   "name":"The Book",
> >   "price":"$9.99",
> >   "price":"$3.00",
> >   "price":"$4.00",
> >   "synopsis":"thanksgiving special",
> >   "starts":"11-24-2011",
> >   "starts":"10-10-2011",
> >   "ends":"11-25-2011",
> >   "ends":"10-11-2011",
> >   "synopsis":"Canadian thanksgiving special"
> > },
> >
> > How does one differentiate the different offers?
> >
> >> But there's no real ability in Solr to store "sub documents",
> >> so you'd have to get creative in how you encoded the discounts...
> >>
> >
> > This is what I'm asking :)
> > What are the best / recommended / known patterns for doing this?
> >
> >>
> >> But I suspect a better approach would be to store each discount as
> >> a separate document. If you're in the trunk version, you could then
> >> group results by, say, ISBN and get responses grouped together...
> >>
> >
> > This is an option but seems suboptimal.
> > So say I store the discounts in multiple documents with ISBN as an
> > attribute and also store the title again with ISBN as an attribute.
> >
> > To get
> > "all books currently discounted"
> >
> > requires 2 requests
> >
> > * get all discounts currently active
> > * get all books using the ISBNs retrieved from the above search
> >
> > Not that bad. However, what happens when I want
> > "all books that are currently on discount in the "horror" genre
> > containing the word 'elm' in the title."
> >
> > The only way I can see of catering for the above search is to duplicate
> > all searchable fields in my "book" document in my "discount" document.
> > Coming from an RDBMS background this seems wrong.
> >
> > Is this the correct approach to take?
> >
> >>
> >> Best
> >> Erick
> >>
> >> On Sat, Jun 4, 2011 at 1:42 AM, Judioo <cont...@judioo.com> wrote:
> >> > Hi,
> >> > Discounts can change daily. Also there can be a lot of them (over time
> >> > and in a given time period).
> >> >
> >> > Could you give an example of what you mean by multi-valuing the field?
> >> >
> >> > Thanks
> >> >
> >> > On 3 June 2011 14:29, Erick Erickson <erickerick...@gmail.com> wrote:
> >> >
> >> >> How often are the discounts changed? Because you can simply
> >> >> re-index the book information with a multiValued "discounts" field
> >> >> and get something similar to your example (&wt=json)....
> >> >>
> >> >>
> >> >> Best
> >> >> Erick
> >> >>
> >> >> On Fri, Jun 3, 2011 at 8:38 AM, Judioo <cont...@judioo.com> wrote:
> >> >> > What is the "best practice" method to index the following in Solr:
> >> >> >
> >> >> > I'm attempting to use Solr for a book store site.
> >> >> >
> >> >> > Each book will have a price, but on occasion this will be discounted.
> >> >> > The discounted price exists for a defined time period, but there may
> >> >> > be many discount periods. Each discount will have a brief synopsis,
> >> >> > start and end time.
> >> >> >
> >> >> > A subset of the desired output would be as follows:
> >> >> >
> >> >> > .......
> >> >> > "response":{"numFound":1,"start":0,"docs":[
> >> >> >   {
> >> >> >     "name":"The Book",
> >> >> >     "price":"$9.99",
> >> >> >     "discounts":[
> >> >> >       {
> >> >> >         "price":"$3.00",
> >> >> >         "synopsis":"thanksgiving special",
> >> >> >         "starts":"11-24-2011",
> >> >> >         "ends":"11-25-2011"
> >> >> >       },
> >> >> >       {
> >> >> >         "price":"$4.00",
> >> >> >         "synopsis":"Canadian thanksgiving special",
> >> >> >         "starts":"10-10-2011",
> >> >> >         "ends":"10-11-2011"
> >> >> >       }
> >> >> >     ]
> >> >> >   },
> >> >> > .........
> >> >> >
> >> >> > A requirement is to be able to search for just discounted
> >> >> > publications. I think I could use date faceting for this (return
> >> >> > publications that are within a discount window). When a discount
> >> >> > search is performed, no publications that are not currently
> >> >> > discounted will be returned.
> >> >> >
> >> >> > My questions are:
> >> >> >
> >> >> > - Does Solr support this type of sub document?
> >> >> >
> >> >> > In the above example the discounts are the sub documents. I know
> >> >> > Solr is not a relational DB, but I would like to store and index the
> >> >> > above representation in a single document if possible.
> >> >> >
> >> >> > - What is the best method to approach the above?
> >> >> >
> >> >> > I can see in many examples the authors tend to denormalize to solve
> >> >> > similar problems. This suggests that for each discount I am required
> >> >> > to duplicate the book data or form a document association
> >> >> > <http://stackoverflow.com/questions/2689399/solr-associations>.
> >> >> > Which method would you advise?
> >> >> >
> >> >> > It would be nice if Solr could return a response structured as above.
> >> >> >
> >> >> > Much Thanks
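
For readers following the thread, here is a minimal sketch of the "flatten it" approach Erick recommends: one Solr document per discount, with the book's searchable fields copied into each, plus a query that asks for currently discounted horror titles containing "elm" and groups the hits by ISBN. Everything specific here is an assumption made for illustration and does not come from the posters: the field names (isbn, genre, list_price, discount_price), the sample ISBN, the id scheme, and the helper names flatten_book and discounted_query. The dates are rewritten in ISO 8601 because Solr date fields expect that form, and the group / group.field parameters need a Solr version with result grouping (the "trunk" feature mentioned above). The sketch only builds the documents and the query string; it does not talk to a Solr server.

```python
from urllib.parse import urlencode


def flatten_book(book):
    """Return one flat document per discount, duplicating the book fields."""
    docs = []
    for i, discount in enumerate(book["discounts"]):
        docs.append({
            # Hypothetical id scheme: ISBN plus the discount's position.
            "id": f"{book['isbn']}-discount-{i}",
            # Book-level fields are copied into every discount document so
            # that a single query can filter on both book and discount data.
            "isbn": book["isbn"],
            "name": book["name"],
            "genre": book["genre"],
            "list_price": book["price"],
            # Discount-level fields stay paired with each other.
            "discount_price": discount["price"],
            "synopsis": discount["synopsis"],
            "starts": discount["starts"],
            "ends": discount["ends"],
        })
    return docs


def discounted_query(title_word, genre):
    """Build a query string for currently active discounts, grouped by ISBN."""
    params = [
        ("q", f"name:{title_word}"),
        ("fq", f"genre:{genre}"),
        ("fq", "starts:[* TO NOW]"),   # discount has already started
        ("fq", "ends:[NOW TO *]"),     # discount has not ended yet
        ("group", "true"),             # one group per book
        ("group.field", "isbn"),
        ("wt", "json"),
    ]
    return urlencode(params)


# Sample data based on the example in the thread; isbn and genre are made up.
book = {
    "isbn": "9780000000000",
    "name": "The Book",
    "genre": "horror",
    "price": "$9.99",
    "discounts": [
        {"price": "$3.00", "synopsis": "thanksgiving special",
         "starts": "2011-11-24T00:00:00Z", "ends": "2011-11-25T00:00:00Z"},
        {"price": "$4.00", "synopsis": "Canadian thanksgiving special",
         "starts": "2011-10-10T00:00:00Z", "ends": "2011-10-11T00:00:00Z"},
    ],
}

docs = flatten_book(book)
qs = discounted_query("elm", "horror")
```

Because each discount keeps its own price, synopsis and dates in one document, the "which start date belongs to which price" ambiguity of the multiValued layout does not arise. The cost is the duplicated book data, which has to be re-indexed whenever a book field changes.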