Hi Jonathan,

Not sure which version of Solr you are using, but look into Join functionality 
- hit #1: http://search-lucene.com/?q=join&fc_project=Solr

Otis
----

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/


>________________________________
>From: Jonathan Carothers <jonathan.caroth...@amentra.com>
>To: "solr-user@lucene.apache.org" <solr-user@lucene.apache.org>
>Sent: Thursday, October 20, 2011 1:23 PM
>Subject: RE: how to handle large relational data in Solr
>
>Actually, that's the root of my concern.  It looks like it product will 
>average ~20,000 associated accessories, still workable, but starting to look 
>painful.  Coming back the other way, I would guess each accessory would be 
>associated with 100 products on average.
>
>Given that there would be searchable fields in both the product and accessory 
>data, I assume I would have to either split them  into separate indexes and 
>merge the results, or have one document per product/accessory combo so that I 
>don't get a mix of accessories matching the search term.  For example, if a 
>product had two accessories, one with the description of "Blue Swing" and 
>another with "Red Ball" and I did a search for "Red Swing" it would rank about 
>the same as a document that actually had a "Red Swing".
>
>So it sounds like you are suggesting the external map, in which case is there 
>a good way to merge the two searches?  Basically on search on product 
>attributes and a second search on the attributes of related accessories?
>
>many thanks,
>Jonathan
>________________________________________
>From: Robert Stewart [bstewart...@gmail.com]
>Sent: Thursday, October 20, 2011 12:05 PM
>To: solr-user@lucene.apache.org
>Subject: Re: how to handle large relational data in Solr
>
>If your "documents" are products, then 100,000 documents is a pretty small 
>index for solr.  Do you know approximately how many accessories are related to 
>each product on average?  If # if relatively small (around 100 or less), then 
>it should be ok to create product documents with all the related accessories 
>as fields on the document, something like:
>
><doc>
>        <field name="id">PRODUCT_ID</field>
>        <field name="name">PRODUCT_NAME</field>
>        <field name="accessory">accessory one</field>
>        <field name="accessory">accessory two</field>
>        ....
>        <field name="accessory">accessory N</field>
></doc>
>
>
>And then you can search for products by accessory, and show accessory facets 
>over products, etc.
>
>Even if # of accessories per product is large (1000 or more), you can still do 
>it this way, but it may be better to store some small accessory ID as integers 
>instead of larger names, and maybe use some external mapping to resolve names 
>for search and display.
>
>Bob
>
>
>On Oct 20, 2011, at 11:08 AM, Jonathan Carothers wrote:
>
>> Agreed, this will just be a read only view of the existing database for 
>> search purposes.  Sorry for the confusion.
>> ________________________________________
>> From: Brandon Ramirez [brandon_rami...@elementk.com]
>> Sent: Thursday, October 20, 2011 10:50 AM
>> To: solr-user@lucene.apache.org
>> Subject: RE: how to handle large relational data in Solr
>>
>> I would not recommend removing your relational database altogether.  You 
>> should treat that as your system of record.  By replacing it, you are 
>> forcing Solr to store the unmodified value for everything even when not 
>> needed.  You also lose normalization.   And if you ever need to add some 
>> data to your system that isn't search-related, you have no choice but to add 
>> it to your search index.
>>
>>
>> Brandon Ramirez | Office: 585.214.5413 | Fax: 585.295.4848
>> Software Engineer II | Element K | www.elementk.com
>>
>>
>> -----Original Message-----
>> From: Jonathan Carothers [mailto:jonathan.caroth...@amentra.com]
>> Sent: Thursday, October 20, 2011 10:12 AM
>> To: solr-user@lucene.apache.org
>> Subject: how to handle large relational data in Solr
>>
>> All,
>>
>> We are attempting to convert a fairly large relational database into Solr 
>> index(es).
>>
>> There are ~100,000 products with ~1,000,000 accessories that can be related 
>> to any number of the products.  So if I include the search terms and the 
>> relationships in the same index, we're looking at a pretty huge index.
>>
>> If we break it out into three indexes, one for the product search, one for 
>> the accessories search, and one for their relationship, is there a good way 
>> to merge the results?
>>
>> Is there a better way to structure the indexes?
>>
>> We will have a relational database available if it makes sense to do some 
>> sort of a hybrid approach.
>>
>> many thanks,
>> Jonathan
>>
>
>
>
>

Reply via email to