Very informative links and statement, Jonathan. Thank you.

On 6 June 2011 20:55, Jonathan Rochkind <rochk...@jhu.edu> wrote:

> This is a start, for many common best practices:
>
> http://wiki.apache.org/solr/SolrRelevancyFAQ
>
> Many of the questions in there have an answer that involves de-normalizing,
> as an example. Even if your specific problem isn't in there, I found that
> reading through it gave me a general sense of common patterns in Solr.
>
> (It's certainly true that some things are hard to do in Solr. It turns
> out that an RDBMS is a remarkably flexible thing -- but when it doesn't do
> something you need well, and you turn to a specialized tool like Solr
> instead, you certainly give up some things.
>
> One of the biggest areas of limitation involves hierarchical or
> relationship data, definitely. There are a variety of features, some more
> fully baked than others, some not yet in a Solr release, meant to provide
> tools to get at different aspects of this. These include "pivot faceting",
> "join" (https://issues.apache.org/jira/browse/SOLR-2272), and
> field-collapsing. Each, IMO, is trying to deal with a different aspect of
> hierarchical or multi-class data, or data that is entities with
> relationships.)
>
> On 6/6/2011 3:43 PM, Judioo wrote:
>
>> I do think that Solr would be better served if there was a *best practice
>> section* of the site.
>>
>> Looking at the majority of emails to this list, they revolve around
>> "how do I do X?".
>>
>> Seems like tutorials with real world examples would serve Solr no end of
>> good.
>>
>> I still do not have an example of the best method to approach my problem,
>> although Erick has helped me understand the limitations of Solr.
>>
>> Just thought I'd say.
>>
>> On 6 June 2011 20:26, Judioo <cont...@judioo.com> wrote:
>>
>>> Thanks
>>>
>>> On 6 June 2011 19:32, Erick Erickson <erickerick...@gmail.com> wrote:
>>>
>>>> #Everybody# (including me) who has any RDBMS background
>>>> doesn't want to flatten data, but that's usually the way to go in
>>>> Solr.
>>>>
>>>> Part of whether it's a good idea or not depends on how big the index
>>>> gets, and unfortunately the only way to figure that out is to test.
>>>>
>>>> But that's the first approach I'd try.
>>>>
>>>> Good luck!
>>>> Erick
>>>>
>>>> On Mon, Jun 6, 2011 at 11:42 AM, Judioo <cont...@judioo.com> wrote:
>>>>
>>>>> On 5 June 2011 14:42, Erick Erickson <erickerick...@gmail.com> wrote:
>>>>>
>>>>>> See: http://wiki.apache.org/solr/SchemaXml
>>>>>>
>>>>>> By adding 'multiValued="true"' to the field, you can add
>>>>>> the same field multiple times in a doc, something like
>>>>>>
>>>>>> <add>
>>>>>>   <doc>
>>>>>>     <field name="mv">value1</field>
>>>>>>     <field name="mv">value2</field>
>>>>>>   </doc>
>>>>>> </add>
>>>>>
>>>>> I can't see how that would work as one would need to associate the
>>>>> right start / end dates and price.
>>>>> As I understand using multivalued and thus flattening the discounts
>>>>> would result in:
>>>>>
>>>>> {
>>>>>   "name":"The Book",
>>>>>   "price":"$9.99",
>>>>>   "price":"$3.00",
>>>>>   "price":"$4.00",
>>>>>   "synopsis":"thanksgiving special",
>>>>>   "starts":"11-24-2011",
>>>>>   "starts":"10-10-2011",
>>>>>   "ends":"11-25-2011",
>>>>>   "ends":"10-11-2011",
>>>>>   "synopsis":"Canadian thanksgiving special",
>>>>> },
>>>>>
>>>>> How does one differentiate the different offers?
>>>>>
>>>>>> But there's no real ability in Solr to store "sub documents",
>>>>>> so you'd have to get creative in how you encoded the discounts...
>>>>>
>>>>> This is what I'm asking :)
>>>>> What is the best / recommended / known patterns for doing this?
>>>>>
>>>>>> But I suspect a better approach would be to store each discount as
>>>>>> a separate document. If you're in the trunk version, you could then
>>>>>> group results by, say, ISBN and get responses grouped together...
>>>>>
>>>>> This is an option but seems suboptimal. So say I store the discounts in
>>>>> multiple documents with ISBN as an attribute and also store the title
>>>>> again with ISBN as an attribute.
>>>>>
>>>>> To get
>>>>> "all books currently discounted"
>>>>>
>>>>> requires 2 requests:
>>>>>
>>>>> * get all discounts currently active
>>>>> * get all books using ISBN retrieved from above search
>>>>>
>>>>> Not that bad. However what happens when I want
>>>>> "all books that are currently on discount in the "horror" genre
>>>>> containing the word 'elm' in the title."
>>>>>
>>>>> The only way I can see in catering for the above search is to duplicate
>>>>> all searchable fields in my "book" document in my "discount" document.
>>>>> Coming from an RDBMS background this seems wrong.
>>>>>
>>>>> Is this the correct approach to take?
>>>>>
>>>>>> Best
>>>>>> Erick
>>>>>>
>>>>>> On Sat, Jun 4, 2011 at 1:42 AM, Judioo <cont...@judioo.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>> Discounts can change daily. Also there can be a lot of them (over
>>>>>>> time and in a given time period).
>>>>>>>
>>>>>>> Could you give an example of what you mean by multi-valuing the
>>>>>>> field?
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>> On 3 June 2011 14:29, Erick Erickson <erickerick...@gmail.com> wrote:
>>>>>>>
>>>>>>>> How often are the discounts changed? Because you can simply
>>>>>>>> re-index the book information with a multiValued "discounts" field
>>>>>>>> and get something similar to your example (&wt=json)....
>>>>>>>>
>>>>>>>> Best
>>>>>>>> Erick
>>>>>>>>
>>>>>>>> On Fri, Jun 3, 2011 at 8:38 AM, Judioo <cont...@judioo.com> wrote:
>>>>>>>>
>>>>>>>>> What is the "best practice" method to index the following in Solr:
>>>>>>>>>
>>>>>>>>> I'm attempting to use Solr for a book store site.
>>>>>>>>>
>>>>>>>>> Each book will have a price but on occasions this will be
>>>>>>>>> discounted. The discounted price exists for a defined time period
>>>>>>>>> but there may be many discount periods. Each discount will have a
>>>>>>>>> brief synopsis, start and end time.
>>>>>>>>>
>>>>>>>>> A subset of the desired output would be as follows:
>>>>>>>>>
>>>>>>>>> .......
>>>>>>>>> "response":{"numFound":1,"start":0,"docs":[
>>>>>>>>>   {
>>>>>>>>>     "name":"The Book",
>>>>>>>>>     "price":"$9.99",
>>>>>>>>>     "discounts":[
>>>>>>>>>       {
>>>>>>>>>         "price":"$3.00",
>>>>>>>>>         "synopsis":"thanksgiving special",
>>>>>>>>>         "starts":"11-24-2011",
>>>>>>>>>         "ends":"11-25-2011",
>>>>>>>>>       },
>>>>>>>>>       {
>>>>>>>>>         "price":"$4.00",
>>>>>>>>>         "synopsis":"Canadian thanksgiving special",
>>>>>>>>>         "starts":"10-10-2011",
>>>>>>>>>         "ends":"10-11-2011",
>>>>>>>>>       },
>>>>>>>>>     ]
>>>>>>>>>   },
>>>>>>>>> .........
>>>>>>>>>
>>>>>>>>> A requirement is to be able to search for just discounted
>>>>>>>>> publications. I think I could use date faceting for this (return
>>>>>>>>> publications that are within a discount window). When a discount
>>>>>>>>> search is performed no publications that are not currently
>>>>>>>>> discounted will be returned.
>>>>>>>>>
>>>>>>>>> My questions are:
>>>>>>>>>
>>>>>>>>> - Does Solr support this type of sub documents?
>>>>>>>>>
>>>>>>>>> In the above example the discounts are the sub documents.
>>>>>>>>> I know Solr is not a relational DB but I would like to store and
>>>>>>>>> index the above representation in a single document if possible.
>>>>>>>>>
>>>>>>>>> - What is the best method to approach the above?
>>>>>>>>>
>>>>>>>>> I can see in many examples the authors tend to denormalize to
>>>>>>>>> solve similar problems. This suggests that for each discount I am
>>>>>>>>> required to duplicate the book data or form a document association
>>>>>>>>> <http://stackoverflow.com/questions/2689399/solr-associations>.
>>>>>>>>> Which method would you advise?
>>>>>>>>>
>>>>>>>>> It would be nice if Solr could return a response structured as
>>>>>>>>> above.
>>>>>>>>>
>>>>>>>>> Much Thanks
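
For anyone reading this thread later, here is a rough sketch of the "one document per discount" approach Erick suggests, with the searchable book fields duplicated on each discount document. It is only an illustration, not a tested Solr setup: the field names (isbn, genre, discount_price, starts, ends), the ISBN value and the id scheme are all assumptions, and the group parameters only apply if your Solr version has result grouping (field collapsing), which the thread notes was trunk-only at the time. The code just builds the flat documents and the request parameters; it does not talk to Solr.

```python
from datetime import date

# Hypothetical book record. The "isbn" and "genre" values are invented for
# illustration; name, prices, synopses and dates come from the thread.
BOOK = {
    "isbn": "0000000000",
    "name": "The Book",
    "genre": "horror",
    "price": 9.99,
    "discounts": [
        {"price": 3.00, "synopsis": "thanksgiving special",
         "starts": date(2011, 11, 24), "ends": date(2011, 11, 25)},
        {"price": 4.00, "synopsis": "Canadian thanksgiving special",
         "starts": date(2011, 10, 10), "ends": date(2011, 10, 11)},
    ],
}


def flatten_book(book):
    """Return one flat document per discount, repeating the book fields."""
    docs = []
    for n, discount in enumerate(book["discounts"]):
        docs.append({
            # Assumed id scheme: ISBN plus the discount's position.
            "id": "{}-discount-{}".format(book["isbn"], n),
            "isbn": book["isbn"],
            "name": book["name"],
            "genre": book["genre"],
            "price": book["price"],
            "discount_price": discount["price"],
            "synopsis": discount["synopsis"],
            # Solr date fields take ISO 8601 UTC timestamps.
            "starts": discount["starts"].isoformat() + "T00:00:00Z",
            "ends": discount["ends"].isoformat() + "T23:59:59Z",
        })
    return docs


def discounted_now_params(title_word, genre):
    """Build parameters for a single 'currently discounted' query."""
    return {
        "q": "name:{}".format(title_word),
        # Filter to the genre and to discounts whose window contains NOW.
        "fq": ["genre:{}".format(genre), "starts:[* TO NOW]", "ends:[NOW TO *]"],
        # Result grouping collapses several discounts of one book together.
        "group": "true",
        "group.field": "isbn",
        "wt": "json",
    }


docs = flatten_book(BOOK)
params = discounted_now_params("elm", "horror")
```

Because each discount document carries its own price, synopsis, starts and ends, the offers stay associated with the right dates, which is the problem with the multiValued flattening. The "horror books with 'elm' in the title that are on discount now" search becomes a single request instead of two, at the cost of repeating the book fields on every discount document.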