Hi Jonathan,

Not sure which version of Solr you are using, but look into the Join functionality - hit #1: http://search-lucene.com/?q=join&fc_project=Solr
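
For illustration only, here is a rough sketch of how a join query could be put together. It assumes accessories are indexed as their own documents with a multi-valued `product_ids` field, and that products are keyed by `id`. Those field names, the `build_join_query` helper, and the localhost URL are all made up for the example, and the `{!join}` query parser is, as far as I know, only available from Solr 4.0 on. Because each accessory is its own document, a phrase search for "Red Swing" should not match a product that merely has a "Blue Swing" and a "Red Ball".

```python
from urllib.parse import urlencode


def build_join_query(accessory_query, from_field="product_ids", to_field="id"):
    """Build URL-encoded parameters for a Solr join query.

    The query first matches accessory documents with `accessory_query`,
    then maps the values in `from_field` on those documents onto
    `to_field` of the product documents, so the results are products.
    Both field names are hypothetical; adjust them to the real schema.
    """
    q = "{!join from=%s to=%s}%s" % (from_field, to_field, accessory_query)
    return urlencode({"q": q, "wt": "json"})


# Hypothetical usage: products that have an accessory described as "Red Swing".
params = build_join_query('description:"Red Swing"')
print("http://localhost:8983/solr/select?" + params)
```

Searching on product attributes and accessory attributes at the same time would still need the two clauses combined in one request; the sketch only covers the accessory-to-product direction.
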
Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

>________________________________
>From: Jonathan Carothers <jonathan.caroth...@amentra.com>
>To: "solr-user@lucene.apache.org" <solr-user@lucene.apache.org>
>Sent: Thursday, October 20, 2011 1:23 PM
>Subject: RE: how to handle large relational data in Solr
>
>Actually, that's the root of my concern. It looks like each product will
>average ~20,000 associated accessories, still workable, but starting to look
>painful. Coming back the other way, I would guess each accessory would be
>associated with 100 products on average.
>
>Given that there would be searchable fields in both the product and accessory
>data, I assume I would have to either split them into separate indexes and
>merge the results, or have one document per product/accessory combo so that I
>don't get a mix of accessories matching the search term. For example, if a
>product had two accessories, one with the description of "Blue Swing" and
>another with "Red Ball" and I did a search for "Red Swing" it would rank about
>the same as a document that actually had a "Red Swing".
>
>So it sounds like you are suggesting the external map, in which case is there
>a good way to merge the two searches? Basically one search on product
>attributes and a second search on the attributes of related accessories?
>
>many thanks,
>Jonathan
>________________________________________
>From: Robert Stewart [bstewart...@gmail.com]
>Sent: Thursday, October 20, 2011 12:05 PM
>To: solr-user@lucene.apache.org
>Subject: Re: how to handle large relational data in Solr
>
>If your "documents" are products, then 100,000 documents is a pretty small
>index for solr. Do you know approximately how many accessories are related to
>each product on average?
>If # is relatively small (around 100 or less), then
>it should be ok to create product documents with all the related accessories
>as fields on the document, something like:
>
><doc>
>  <field name="id">PRODUCT_ID</field>
>  <field name="name">PRODUCT_NAME</field>
>  <field name="accessory">accessory one</field>
>  <field name="accessory">accessory two</field>
>  ....
>  <field name="accessory">accessory N</field>
></doc>
>
>
>And then you can search for products by accessory, and show accessory facets
>over products, etc.
>
>Even if # of accessories per product is large (1000 or more), you can still do
>it this way, but it may be better to store small accessory IDs as integers
>instead of larger names, and maybe use some external mapping to resolve names
>for search and display.
>
>Bob
>
>
>On Oct 20, 2011, at 11:08 AM, Jonathan Carothers wrote:
>
>> Agreed, this will just be a read only view of the existing database for
>> search purposes. Sorry for the confusion.
>> ________________________________________
>> From: Brandon Ramirez [brandon_rami...@elementk.com]
>> Sent: Thursday, October 20, 2011 10:50 AM
>> To: solr-user@lucene.apache.org
>> Subject: RE: how to handle large relational data in Solr
>>
>> I would not recommend removing your relational database altogether. You
>> should treat that as your system of record. By replacing it, you are
>> forcing Solr to store the unmodified value for everything even when not
>> needed. You also lose normalization. And if you ever need to add some
>> data to your system that isn't search-related, you have no choice but to add
>> it to your search index.
>>
>>
>> Brandon Ramirez | Office: 585.214.5413 | Fax: 585.295.4848
>> Software Engineer II | Element K | www.elementk.com
>>
>>
>> -----Original Message-----
>> From: Jonathan Carothers [mailto:jonathan.caroth...@amentra.com]
>> Sent: Thursday, October 20, 2011 10:12 AM
>> To: solr-user@lucene.apache.org
>> Subject: how to handle large relational data in Solr
>>
>> All,
>>
>> We are attempting to convert a fairly large relational database into Solr
>> index(es).
>>
>> There are ~100,000 products with ~1,000,000 accessories that can be related
>> to any number of the products. So if I include the search terms and the
>> relationships in the same index, we're looking at a pretty huge index.
>>
>> If we break it out into three indexes, one for the product search, one for
>> the accessories search, and one for their relationship, is there a good way
>> to merge the results?
>>
>> Is there a better way to structure the indexes?
>>
>> We will have a relational database available if it makes sense to do some
>> sort of a hybrid approach.
>>
>> many thanks,
>> Jonathan
>>