Very informative links and statement, Jonathan. Thank you.


On 6 June 2011 20:55, Jonathan Rochkind <rochk...@jhu.edu> wrote:

> This is a start, for many common best practices:
>
> http://wiki.apache.org/solr/SolrRelevancyFAQ
>
> Many of the questions in there have an answer that involves de-normalizing,
> as an example. Even if your specific problem isn't in there, I found that
> reading through it gave me a general sense of common patterns in Solr.
>
> ( It's certainly true that some things are hard to do in Solr.  It turns
> out that an RDBMS is a remarkably flexible thing -- but when it doesn't do
> something you need well, and you turn to a specialized tool like Solr
> instead, you certainly give up some things.
>
> One of the biggest areas of limitation involves hierarchical or
> relationship data, definitely. There are a variety of features, some more
> fully baked than others, some not yet in a Solr release, meant to provide
> tools to get at different aspects of this, including "pivot faceting",
> "join" (https://issues.apache.org/jira/browse/SOLR-2272), and
> field-collapsing.  Each, IMO, is trying to deal with a different aspect of
> hierarchical or multi-class data, or data that is entities
> with relationships. )
>
>
> On 6/6/2011 3:43 PM, Judioo wrote:
>
>> I do think that Solr would be better served if there was a *best practice
>> section* of the site.
>>
>> Looking at the majority of emails to this list, they revolve around "how do
>> I do X?".
>>
>> Seems like tutorials with real world examples would do Solr no end of
>> good.
>>
>> I still do not have an example of the best method to approach my problem,
>> although Erick has helped me understand the limitations of Solr.
>>
>> Just thought I'd say.
>>
>>
>>
>>
>>
>>
>> On 6 June 2011 20:26, Judioo<cont...@judioo.com>  wrote:
>>
>>  Thanks
>>>
>>>
>>> On 6 June 2011 19:32, Erick Erickson<erickerick...@gmail.com>  wrote:
>>>
>>>  #Everybody# (including me) who has any RDBMS background
>>>> doesn't want to flatten data, but that's usually the way to go in
>>>> Solr.
>>>>
>>>> Part of whether it's a good idea or not depends on how big the index
>>>> gets, and unfortunately the only way to figure that out is to test.
>>>>
>>>> But that's the first approach I'd try.
>>>>
>>>> Good luck!
>>>> Erick
>>>>
>>>> On Mon, Jun 6, 2011 at 11:42 AM, Judioo<cont...@judioo.com>  wrote:
>>>>
>>>>> On 5 June 2011 14:42, Erick Erickson<erickerick...@gmail.com>  wrote:
>>>>>
>>>>>  See: http://wiki.apache.org/solr/SchemaXml
>>>>>>
>>>>>> By adding 'multiValued="true"' to the field, you can add
>>>>>> the same field multiple times in a doc, something like
>>>>>>
>>>>>> <add>
>>>>>> <doc>
>>>>>>  <field name="mv">value1</field>
>>>>>>  <field name="mv">value2</field>
>>>>>> </doc>
>>>>>> </add>
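For anyone who would rather generate that update message than write it by hand, here is a minimal Python sketch. It only assumes that the field ("mv", as in the example above) is declared with multiValued="true" in the schema; the helper name build_add_xml is made up for illustration and is not part of any Solr client.

```python
import xml.etree.ElementTree as ET


def build_add_xml(field_name, values):
    """Build a Solr <add> message that repeats one field once per value.

    Hypothetical helper: the field is assumed to be declared with
    multiValued="true" in schema.xml.
    """
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for value in values:
        # One <field> element per value; the same name is simply repeated.
        field = ET.SubElement(doc, "field", name=field_name)
        field.text = value
    return ET.tostring(add, encoding="unicode")


xml_message = build_add_xml("mv", ["value1", "value2"])
print(xml_message)
```

The resulting string has the same shape as the hand-written example and could be posted to the update handler.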
>>>>>>
>>>>> I can't see how that would work, as one would need to associate the
>>>>> right start / end dates and price.
>>>>> As I understand it, using multivalued and thus flattening the discounts
>>>>> would result in:
>>>>>
>>>>> {
>>>>>    "name":"The Book",
>>>>>    "price":"$9.99",
>>>>>    "price":"$3.00",
>>>>>    "price":"$4.00",
>>>>>    "synopsis":"thanksgiving special",
>>>>>    "starts":"11-24-2011",
>>>>>    "starts":"10-10-2011",
>>>>>    "ends":"11-25-2011",
>>>>>    "ends":"10-11-2011",
>>>>>    "synopsis":"Canadian thanksgiving special",
>>>>>  },
>>>>>
>>>>> How does one differentiate the different offers?
>>>>>
>>>>>
>>>>>
>>>>>> But there's no real ability in Solr to store "sub documents",
>>>>>> so you'd have to get creative in how you encoded the discounts...
>>>>>>
>>>>> This is what I'm asking :)
>>>>> What are the best / recommended / known patterns for doing this?
>>>>>
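One possible reading of "get creative" is to pack each discount into a single string and store those strings in one multiValued field, so the price, dates and synopsis of an offer stay together. The sketch below is only an illustration of that idea: the "|" delimiter, the field order and the helper names are all assumptions, not an established Solr convention.

```python
SEP = "|"  # hypothetical delimiter; assumed never to occur in price or dates


def encode_discount(discount):
    """Pack one discount into a single string for a multiValued field."""
    return SEP.join([
        discount["price"],
        discount["starts"],
        discount["ends"],
        discount["synopsis"],
    ])


def decode_discount(packed):
    """Reverse encode_discount; the synopsis is last so it may contain SEP."""
    price, starts, ends, synopsis = packed.split(SEP, 3)
    return {"price": price, "starts": starts, "ends": ends, "synopsis": synopsis}


discounts = [
    {"price": "$3.00", "synopsis": "thanksgiving special",
     "starts": "11-24-2011", "ends": "11-25-2011"},
    {"price": "$4.00", "synopsis": "Canadian thanksgiving special",
     "starts": "10-10-2011", "ends": "10-11-2011"},
]

packed_values = [encode_discount(d) for d in discounts]
restored = [decode_discount(p) for p in packed_values]
print(packed_values)
```

This keeps each offer intact for display, but the packed value is just an opaque string to the index, so it does not by itself let you search on the start and end dates.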
>>>>>
>>>>>
>>>>>> But I suspect a better approach would be to store each discount as
>>>>>> a separate document. If you're in the trunk version, you could then
>>>>>> group results by, say, ISBN and get responses grouped together...
>>>>>>
>>>>> This is an option but seems suboptimal. Say I store the discounts in
>>>>> multiple documents with ISBN as an attribute, and also store the title
>>>>> again with ISBN as an attribute.
>>>>>
>>>>> To get
>>>>> "all books currently discounted"
>>>>>
>>>>> requires 2 requests:
>>>>>
>>>>> * get all discounts currently active
>>>>> * get all books using the ISBNs retrieved from the above search
>>>>>
>>>>> Not that bad. However, what happens when I want
>>>>> "all books that are currently on discount in the 'horror' genre
>>>>> containing the word 'elm' in the title"?
>>>>>
>>>>> The only way I can see of catering for the above search is to duplicate
>>>>> all searchable fields in my "book" document in my "discount" document.
>>>>> Coming from an RDBMS background this seems wrong.
>>>>>
>>>>> Is this the correct approach to take?
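To make the duplication concrete, here is a rough Python sketch of the flattening step: one Solr document per discount, each carrying a copy of the searchable book fields. Everything beyond the names used in the thread (the "isbn", "genre" and "discount_*" field names, the id scheme, the placeholder ISBN) is an assumption for illustration. The dates are converted because Solr date fields expect the ISO 8601 UTC form rather than MM-DD-YYYY.

```python
from datetime import datetime


def to_solr_date(mdy):
    """Convert 'MM-DD-YYYY' (as written in the thread) to ISO 8601 UTC."""
    return datetime.strptime(mdy, "%m-%d-%Y").strftime("%Y-%m-%dT%H:%M:%SZ")


def flatten(book):
    """Return one denormalized document per discount of the given book."""
    docs = []
    for index, discount in enumerate(book["discounts"]):
        docs.append({
            # Hypothetical unique key: ISBN plus the discount's position.
            "id": "{}-{}".format(book["isbn"], index),
            # Book fields are copied into every discount document.
            "isbn": book["isbn"],
            "name": book["name"],
            "genre": book["genre"],
            "price": book["price"],
            # Discount fields stay together in the same document.
            "discount_price": discount["price"],
            "discount_synopsis": discount["synopsis"],
            "discount_starts": to_solr_date(discount["starts"]),
            "discount_ends": to_solr_date(discount["ends"]),
        })
    return docs


book = {
    "isbn": "0000000000",  # placeholder value, not a real ISBN
    "name": "The Book",
    "genre": "horror",
    "price": "$9.99",
    "discounts": [
        {"price": "$3.00", "synopsis": "thanksgiving special",
         "starts": "11-24-2011", "ends": "11-25-2011"},
        {"price": "$4.00", "synopsis": "Canadian thanksgiving special",
         "starts": "10-10-2011", "ends": "10-11-2011"},
    ],
}

flat_docs = flatten(book)
for doc in flat_docs:
    print(doc["id"], doc["discount_starts"], doc["discount_ends"])
```

Because every document holds both the book fields and one offer, a single request can combine title, genre and discount-window conditions. A book with no discounts yields no documents in this sketch, so it would need a plain document of its own if it should still be searchable.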
>>>>>
>>>>>
>>>>>
>>>>>  Best
>>>>>> Erick
>>>>>>
>>>>>> On Sat, Jun 4, 2011 at 1:42 AM, Judioo<cont...@judioo.com>  wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>> Discounts can change daily. Also there can be a lot of them (over
>>>>>>> time and in a given time period).
>>>>>>>
>>>>>>> Could you give an example of what you mean by multi-valuing the
>>>>>>> field?
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>> On 3 June 2011 14:29, Erick Erickson<erickerick...@gmail.com> wrote:
>>>>>>>
>>>>>>>> How often are the discounts changed? Because you can simply
>>>>>>>> re-index the book information with a multiValued "discounts" field
>>>>>>>> and get something similar to your example (&wt=json)....
>>>>>>>>
>>>>>>>>
>>>>>>>> Best
>>>>>>>> Erick
>>>>>>>>
>>>>>>>> On Fri, Jun 3, 2011 at 8:38 AM, Judioo<cont...@judioo.com>  wrote:
>>>>>>>>
>>>>>>>>> What is the "best practice" method to index the following in Solr:
>>>>>>>>>
>>>>>>>>> I'm attempting to use solr for a book store site.
>>>>>>>>>
>>>>>>>>> Each book will have a price, but on occasion this will be discounted.
>>>>>>>>> The discounted price exists for a defined time period, but there may
>>>>>>>>> be many discount periods. Each discount will have a brief synopsis,
>>>>>>>>> start and end time.
>>>>>>>>>
>>>>>>>>> A subset of the desired output would be as follows:
>>>>>>>>>
>>>>>>>>> .......
>>>>>>>>> "response":{"numFound":1,"start":0,"docs":[
>>>>>>>>>  {
>>>>>>>>>    "name":"The Book",
>>>>>>>>>    "price":"$9.99",
>>>>>>>>>    "discounts":[
>>>>>>>>>        {
>>>>>>>>>         "price":"$3.00",
>>>>>>>>>         "synopsis":"thanksgiving special",
>>>>>>>>>         "starts":"11-24-2011",
>>>>>>>>>         "ends":"11-25-2011",
>>>>>>>>>        },
>>>>>>>>>        {
>>>>>>>>>         "price":"$4.00",
>>>>>>>>>         "synopsis":"Canadian thanksgiving special",
>>>>>>>>>         "starts":"10-10-2011",
>>>>>>>>>         "ends":"10-11-2011",
>>>>>>>>>        },
>>>>>>>>>     ]
>>>>>>>>>  },
>>>>>>>>>  .........
>>>>>>>>>
>>>>>>>>> A requirement is to be able to search for just discounted
>>>>>>>>> publications. I think I could use date faceting for this (return
>>>>>>>>> publications that are within a discount window). When a discount
>>>>>>>>> search is performed, no publications that are not currently
>>>>>>>>> discounted will be returned.
>>>>>>>>>
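A range filter on the discount window may be enough for this requirement, without faceting. The sketch below builds the request parameters in Python; it assumes each indexed document has date fields named "discount_starts" and "discount_ends" (hypothetical names), and the optional grouping parameters only apply to Solr versions that include result grouping.

```python
from urllib.parse import parse_qs, urlencode

# Assumed field names; NOW is Solr's date-math keyword for the current time.
ACTIVE_FILTER = "discount_starts:[* TO NOW] AND discount_ends:[NOW TO *]"


def discounted_query(user_query=None, group_field=None):
    """Build a query string that only matches currently active discounts."""
    params = [
        ("q", user_query or "*:*"),  # match everything unless a query is given
        ("fq", ACTIVE_FILTER),       # restrict to the open discount window
        ("wt", "json"),
    ]
    if group_field:
        # Result grouping (field collapsing): one group per value of the field.
        params += [("group", "true"), ("group.field", group_field)]
    return urlencode(params)


query_string = discounted_query("name:elm AND genre:horror", group_field="isbn")
decoded = parse_qs(query_string)
print(query_string)
print(decoded["fq"][0])
```

The string can be appended to the select handler URL of your Solr instance. Documents whose window is not open right now simply fail the filter, which matches the requirement that undiscounted publications are not returned.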
>>>>>>>>> My questions are:
>>>>>>>>>
>>>>>>>>>   - Does Solr support this type of sub document?
>>>>>>>>>
>>>>>>>>> In the above example the discounts are the sub documents. I know
>>>>>>>>> Solr is not a relational DB, but I would like to store and index
>>>>>>>>> the above representation in a single document if possible.
>>>>>>>>>
>>>>>>>>>   - What is the best method to approach the above?
>>>>>>>>>
>>>>>>>>> I can see in many examples the authors tend to denormalize to solve
>>>>>>>>> similar problems. This suggests that for each discount I am required
>>>>>>>>> to duplicate the book data or form a document association
>>>>>>>>> <http://stackoverflow.com/questions/2689399/solr-associations>.
>>>>>>>>> Which method would you advise?
>>>>>>>>>
>>>>>>>>> It would be nice if Solr could return a response structured as
>>>>>>>>> above.
>>>>>>>>>
>>>>>>>>> Much Thanks
>>>>>>>>>
>>>>>>>>>
>>>
