The very first thing to try is to flatten your data so you don't have to use
joins at all. I know that goes against your database instincts, but Solr easily
handles millions and millions of documents. So if the cross-product of docs and
modules isn't prohibitive, that's what I'd do first. Then it's just a matter of
forming a search without joins.
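
As a rough sketch of what flattening could look like, assuming the three
document types are available as plain Python dicts before indexing: the field
names come from the original message, but the helper names (flatten,
flat_query_params) and the choice to index one record per doc/module link are
only illustrative assumptions, not anything Solr prescribes.

```python
def flatten(docs, modules, docmodules):
    """Build one denormalized record per (doc, module) link."""
    docs_by_id = {d["id"]: d for d in docs}
    modules_by_id = {m["id"]: m for m in modules}
    flat = []
    for link in docmodules:
        doc = docs_by_id[link["docrefid"]]
        mod = modules_by_id[link["modrefid"]]
        flat.append({
            "id": link["id"],              # reuse the link id as the unique key
            "docrefid": link["docrefid"],  # id of the owning "doc"
            "modrefid": link["modrefid"],  # id of the "module"
            "docid": doc["docid"],
            "doctitle": doc["doctitle"],
            "modid": mod["modid"],
            "text": mod["text"],           # module text copied onto the record
        })
    return flat


def flat_query_params(text, doc_id):
    """Request parameters for a join-free search over the flattened records."""
    return {
        "q": f'text:"{text}"',          # match the module text
        "fq": f'docrefid:"{doc_id}"',   # restrict to one document
    }
```

With records like these, "all modules containing a certain text and associated
with a given document" becomes an ordinary query plus a filter query on
docrefid.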

Joins run into performance issues when the join field has many unique
values. Unfortunately, the field people often want to join on is something
like a <uniqueKey> (or PK in RDBMS terms), so be aware of that.
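
If you do want to try the join route anyway, a query along these lines might
work. This is only a sketch: it assumes the schema from the original message
(modrefid on "docmodule" documents holds the id of a "module" document), and
the helper name join_query_params is hypothetical. The {!join from=... to=...}
local-params syntax is Solr's join query parser.

```python
from urllib.parse import urlencode


def join_query_params(text, doc_id):
    """Request parameters that search module text, limited to one doc via a join."""
    return {
        # main query: match the text of "module" documents
        "q": f'text:"{text}"',
        # filter: find "docmodule" entries for the doc, then map their
        # modrefid values onto the id field of the modules
        "fq": f'{{!join from=modrefid to=id}}docrefid:"{doc_id}"',
    }


params = join_query_params("some text", "D1")
query_string = urlencode(params)  # append to the select handler URL
```

Note that the join here is on the id field, which is exactly the
many-unique-values case described above, so measure it against your real data.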

Best
Erick

On Fri, Sep 21, 2012 at 5:46 AM,  <milen.ti...@materna.de> wrote:
> Dear Solr community,
>
> I am rather new to Solr, however I already find it kind of attractive. We are 
> developing a research application, which contains a Solr index with three 
> different kinds of documents. Here is the basic idea:
>
>
> - A document of type "doc" consisting of fields id, docid, doctitle
>   and some other metadata
>
> - A document of type "module" consisting of fields id, modid and text
>
> - A document of type "docmodule" consisting of fields id, docrefid,
>   modrefid and some metadata about the relation between a document and a
>   module; field docrefid refers to the id of a "doc" document, while field
>   modrefid contains the id of a "module" document
>
> In other words, in our model there are documents (type "doc") consisting of 
> several modules and there is some characterization of each link between a 
> document and a module.
>
> Almost all fields of a "doc" document are searchable, as well as the text of 
> a module and the metadata of the "docmodule" entries.
>
> We are looking for a fast way to retrieve all modules containing a certain 
> text and associated with a given document, preferably with a single query. 
> This means we want to query the text from a "module" document while we set a 
> restriction on the docrefid from a "docmodule" or the id from a "doc" 
> document. Is this possible by means of the new pseudo joins? Any ideas are 
> highly appreciated!
>
> Thanks in advance!
>
> Milen Tilev
> Master of Science
> Softwareentwickler
> Business Unit Information
> ________________________________________________
>
> MATERNA GmbH
> Information & Communications
>
> Voßkuhle 37
> 44141 Dortmund
> Deutschland
>
> Telefon: +49 231 5599-8257
> Fax: +49 231 5599-98257
> E-Mail: milen.ti...@materna.de<mailto:milen.ti...@materna.de>
>
> www.materna.de<http://www.materna.de/> | 
> Newsletter<http://www.materna.de/newsletter> | 
> Twitter<http://twitter.com/MATERNA_GmbH> | 
> XING<http://www.xing.com/companies/MATERNAGMBH> | 
> Facebook<http://www.facebook.com/maternagmbh>
> ________________________________________________
>
> Sitz der MATERNA GmbH: Voßkuhle 37, 44141 Dortmund
> Geschäftsführer: Dr. Winfried Materna, Helmut an de Meulen, Ralph Hartwig
> Amtsgericht Dortmund HRB 5839
>
