This is a good starting point for many common best practices:

http://wiki.apache.org/solr/SolrRelevancyFAQ

Many of the answers in there involve de-normalizing, for example. Even if your specific problem isn't covered, I found that reading through it gave me a general sense of the common patterns in Solr.

(It's certainly true that some things are hard to do in Solr. It turns out that an RDBMS is a remarkably flexible thing, but when it doesn't do something you need well, and you turn to a specialized tool like Solr instead, you certainly give up some things.

One of the biggest areas of limitation definitely involves hierarchical or relationship data. There are a variety of features meant to provide tools for getting at different aspects of this, some more fully baked than others and some not yet in a Solr release, including "pivot faceting", "join" (https://issues.apache.org/jira/browse/SOLR-2272), and field collapsing. Each, IMO, is trying to deal with a different aspect of hierarchical or multi-class data, or data that consists of entities with relationships.)

On 6/6/2011 3:43 PM, Judioo wrote:
I do think that Solr would be better served if there was a *best practice
section* of the site.

Looking at the majority of emails to this list, they revolve around "how do I
do X?".

Seems like tutorials with real world examples would serve Solr no end of
good.

I still do not have an example of the best method to approach my problem,
although Erick has helped me understand the limitations of Solr.

Just thought I'd say.

On 6 June 2011 20:26, Judioo<cont...@judioo.com>  wrote:

Thanks


On 6 June 2011 19:32, Erick Erickson<erickerick...@gmail.com>  wrote:

#Everybody# (including me) who has any RDBMS background
doesn't want to flatten data, but that's usually the way to go in
Solr.

Part of whether it's a good idea or not depends on how big the index
gets, and unfortunately the only way to figure that out is to test.

But that's the first approach I'd try.

Good luck!
Erick

On Mon, Jun 6, 2011 at 11:42 AM, Judioo<cont...@judioo.com>  wrote:
On 5 June 2011 14:42, Erick Erickson<erickerick...@gmail.com>  wrote:

See: http://wiki.apache.org/solr/SchemaXml

By adding multiValued="true" to the field definition, you can add
the same field multiple times in a doc, something like

<add>
<doc>
  <field name="mv">value1</field>
  <field name="mv">value2</field>
</doc>
</add>
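
A minimal sketch of generating that kind of update message from code, assuming Python's standard xml.etree.ElementTree module. The helper name build_add_xml is made up for illustration, and "mv" is just the field name from the example above; posting the message to Solr is left out.

```python
import xml.etree.ElementTree as ET

def build_add_xml(field_name, values):
    """Build a Solr <add> message that repeats one field once per value."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for value in values:
        # One <field> element per value; the field must be multiValued in the schema.
        field = ET.SubElement(doc, "field", name=field_name)
        field.text = value
    return ET.tostring(add, encoding="unicode")

xml_message = build_add_xml("mv", ["value1", "value2"])
print(xml_message)
```

The field still has to be declared with multiValued="true" in schema.xml for Solr to accept the repeated values.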

I can't see how that would work, as one would need to associate the right
start / end dates and price.
As I understand it, using multivalued and thus flattening the discounts would
result in:

{
    "name":"The Book",
    "price":"$9.99",
    "price":"$3.00",
    "price":"$4.00",
    "synopsis":"thanksgiving special",
    "starts":"11-24-2011",
    "starts":"10-10-2011",
    "ends":"11-25-2011",
    "ends":"10-11-2011",
    "synopsis":"Canadian thanksgiving special",
  },

How does one differentiate the different offers?



But there's no real ability  in Solr to store "sub documents",
so you'd have to get creative in how you encoded the discounts...

This is what I'm asking :)
What is the best / recommended / known patterns for doing this?
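
One possible reading of "get creative", offered only as a sketch and not as a recommended Solr pattern: pack each offer into a single delimited string and store those strings in one multiValued field, so the price, dates and synopsis of an offer stay together. The delimiter and the function names below are assumptions made for illustration.

```python
SEP = "|"  # assumed delimiter; it must never appear inside a value

def encode_discount(price, starts, ends, synopsis):
    """Pack one offer into a single string for a multiValued field."""
    return SEP.join([price, starts, ends, synopsis])

def decode_discount(encoded):
    """Unpack a string produced by encode_discount back into a dict."""
    price, starts, ends, synopsis = encoded.split(SEP)
    return {"price": price, "starts": starts, "ends": ends, "synopsis": synopsis}

offers = [
    encode_discount("$3.00", "11-24-2011", "11-25-2011", "thanksgiving special"),
    encode_discount("$4.00", "10-10-2011", "10-11-2011", "Canadian thanksgiving special"),
]
decoded = [decode_discount(offer) for offer in offers]
print(decoded[0]["synopsis"])
```

This only helps with display: the client can rebuild the nested "discounts" list, but Solr cannot run a date range search inside the packed strings.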



But I suspect a better approach would be to store each discount as
a separate document. If you're in the trunk version, you could then
group results by, say, ISBN and get responses grouped together...

This is an option but seems suboptimal. So say I store the discounts in
multiple documents with ISBN as an attribute, and also store the title again
with ISBN as an attribute.

To get
"all books currently discounted"

requires 2 requests:

* get all discounts currently active
* get all books using the ISBNs retrieved from the above search

Not that bad. However, what happens when I want
"all books that are currently on discount in the "horror" genre containing
the word 'elm' in the title"?

The only way I can see of catering for the above search is to duplicate all
searchable fields in my "book" document in my "discount" document. Coming
from an RDBMS background this seems wrong.

Is this the correct approach to take?
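
A rough sketch of what that duplication could look like at index time, assuming one flat document per discount. All field names (isbn, genre, discount_price and so on), the ISBN value and the genre are placeholders invented for illustration, and the dates are rewritten in the ISO 8601 form that Solr date fields expect.

```python
def flatten_book(book):
    """Return one flat document per discount, copying the searchable book fields."""
    docs = []
    for n, discount in enumerate(book["discounts"]):
        docs.append({
            "id": "{}-discount-{}".format(book["isbn"], n),  # unique key per discount
            "isbn": book["isbn"],      # shared key, usable for grouping results
            "name": book["name"],      # duplicated so title searches hit discount docs
            "genre": book["genre"],    # duplicated so genre filters hit discount docs
            "price": book["price"],
            "discount_price": discount["price"],
            "discount_synopsis": discount["synopsis"],
            "discount_starts": discount["starts"],
            "discount_ends": discount["ends"],
        })
    return docs

book = {
    "isbn": "0000000000",  # placeholder, not a real ISBN
    "name": "The Book",
    "genre": "horror",     # placeholder genre
    "price": "9.99",
    "discounts": [
        {"price": "3.00", "synopsis": "thanksgiving special",
         "starts": "2011-11-24T00:00:00Z", "ends": "2011-11-25T00:00:00Z"},
        {"price": "4.00", "synopsis": "Canadian thanksgiving special",
         "starts": "2011-10-10T00:00:00Z", "ends": "2011-10-11T00:00:00Z"},
    ],
}
flat_docs = flatten_book(book)
print(len(flat_docs))
```

With documents shaped like this, a single request can combine a title search, a genre filter and a discount date window, at the cost of a larger index.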



Best
Erick

On Sat, Jun 4, 2011 at 1:42 AM, Judioo<cont...@judioo.com>  wrote:
Hi,
Discounts can change daily. Also there can be a lot of them (over time and
in a given time period).

Could you give an example of what you mean by multi-valuing the field?
Thanks

On 3 June 2011 14:29, Erick Erickson<erickerick...@gmail.com>  wrote:
How often are the discounts changed? Because you can simply
re-index the book information with a multiValued "discounts" field
and get something similar to your example (&wt=json)....


Best
Erick

On Fri, Jun 3, 2011 at 8:38 AM, Judioo<cont...@judioo.com>  wrote:
What is the "best practice" method to index the following in Solr:

I'm attempting to use Solr for a book store site.

Each book will have a price, but on occasion this will be discounted. The
discounted price exists for a defined time period, but there may be many
discount periods. Each discount will have a brief synopsis, start and end
time.

A subset of the desired output would be as follows:

.......
"response":{"numFound":1,"start":0,"docs":[
  {
    "name":"The Book",
    "price":"$9.99",
    "discounts":[
        {
         "price":"$3.00",
         "synopsis":"thanksgiving special",
         "starts":"11-24-2011",
         "ends":"11-25-2011",
        },
        {
         "price":"$4.00",
         "synopsis":"Canadian thanksgiving special",
         "starts":"10-10-2011",
         "ends":"10-11-2011",
        },
     ]
  },
  .........

A requirement is to be able to search for just discounted publications. I
think I could use date faceting for this (return publications that are
within a discount window). When a discount search is performed, no
publications that are not currently discounted will be returned.
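
As a sketch of how such a "currently discounted" search might be expressed with range filters rather than faceting, assuming one document per discount with hypothetical date fields discount_starts and discount_ends and a hypothetical isbn field. The group parameters are the result grouping (field collapsing) options as I understand them, and they depend on the Solr version in use.

```python
from urllib.parse import urlencode, parse_qs

def discounted_now_params():
    """Query parameters selecting only discounts whose window contains NOW."""
    return [
        ("q", "*:*"),
        ("fq", "discount_starts:[* TO NOW]"),  # discount has already started
        ("fq", "discount_ends:[NOW TO *]"),    # discount has not ended yet
        ("group", "true"),                     # collapse discounts of the same book
        ("group.field", "isbn"),
        ("wt", "json"),
    ]

query_string = urlencode(discounted_now_params())
print(query_string)
```

The resulting string would be appended to the select URL of whatever Solr instance is being queried.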

My questions are:

   - Does Solr support this type of sub-document?

In the above example the discounts are the sub-documents. I know Solr is not
a relational DB, but I would like to store and index the above representation
in a single document if possible.

   - What is the best method to approach the above?

I can see in many examples the authors tend to denormalize to solve similar
problems. This suggests that for each discount I am required to duplicate the
book data or form a document association
<http://stackoverflow.com/questions/2689399/solr-associations>.
Which method would you advise?

It would be nice if Solr could return a response structured as above.
Much Thanks

