As I remember, the underlying algorithm enumerates all the unique values in the field when doing the join (or something like that). So when the field you're joining on has many unique values, it performs poorly. Worse, it'll be fine on small data sets, the kind we usually develop with. But then when you load your real data set (usually orders of magnitude more data than we test with) you set yourself up for "surprises"....
Trying to use Solr in an RDBMS-like manner is simply not using Solr's strengths. An army of very bright people put a lot of work into making DBs performant in their problem space, work that doesn't carry over to a search engine...

FWIW
Erick

On Tue, Oct 30, 2012 at 6:39 PM, Steven Livingstone Pérez <webl...@hotmail.com> wrote:
> Thanks. Can you explain a bit more about your second point below.
> Specifically what makes it a bad fit? (design wise, performance)?
>
> Thanks again.
> Steven
>
> Sent from my Windows Phone
> ________________________________
> From: Erick Erickson
> Sent: 30/10/2012 22:22
> To: solr-user@lucene.apache.org
> Subject: Re: Are there any limitations on multi-value field joins?
>
> Whenever anyone starts talking about using Solr to perform what
> would be multi-way DB joins I break out in hives.
>
> First of all, the "limited join capability" in Solr only returns the
> values from ONE of the documents. There's no way to return values
> from both the from and to documents.
>
> Second, Solr's join capability is a poor fit if the fields being joined have
> many unique values, so that's something to be careful of....
>
> I'd advise that you see if you can flatten (de-normalize) your data such
> that you can make simple queries rather than try to use Solr like you
> would a DB...
>
> FWIW,
> Erick
>
> On Tue, Oct 30, 2012 at 7:20 AM, Steven Livingstone Pérez
> <webl...@hotmail.com> wrote:
>> Hi - I've done quite a bit of Googling and reading but can't find a
>> definitive answer to this.
>> I would like to have a list of key data rows each with a unique id and some
>> data.
>> datarow1 a b c
>> datarow2 x y z
>> datarow3 m n o
>> ...
>> I'd then like to have other rows that point to one or more of the data rows
>> that have a multi-valued field that can contain one or many of the unique
>> id's above.
>> User1 datarow1, datarow2, datarow3 etc
>> User2 datarow4, datarow21, datarow43 etc
>> ...
>> Then i will join from the user1 row to the data row.
>> My question is simply are there *any* limitations on doing this kind of join?
>> I believe there are some geo-spatial issues and sorting (i don't need to
>> sort on the id) but before i jump fully into this approach i'd like to
>> understand anything i may run into - or whether it is better to have them as
>> individual rows and join them that way.
>> many thanks,
>> /steven
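
For what it's worth, the "flatten (de-normalize)" advice in the quoted reply could look roughly like the sketch below. It is only an illustration, not anything from Solr itself: the function name `flatten`, the field names (`user_id`, `datarow_id`, `col1`..`col3`) and the `<user>_<row>` id scheme are all made up for the example. The idea is to write one document per (user, data row) pair at index time, so that a plain query on `user_id` returns the row data without any join.

```python
from typing import Dict, List


def flatten(users: Dict[str, List[str]],
            data_rows: Dict[str, Dict[str, str]]) -> List[dict]:
    """Build one denormalized document per (user, data row) pair.

    All field names here are hypothetical; adapt them to your schema.
    """
    docs = []
    for user_id, row_ids in users.items():
        for row_id in row_ids:
            row = data_rows.get(row_id)
            if row is None:
                continue  # the user points at a row we don't have; skip it
            doc = {
                "id": f"{user_id}_{row_id}",  # made-up unique key scheme
                "user_id": user_id,
                "datarow_id": row_id,
            }
            doc.update(row)  # copy the row's own fields into the document
            docs.append(doc)
    return docs


# Sample data modelled on the rows in the original question.
data_rows = {
    "datarow1": {"col1": "a", "col2": "b", "col3": "c"},
    "datarow2": {"col1": "x", "col2": "y", "col3": "z"},
    "datarow3": {"col1": "m", "col2": "n", "col3": "o"},
}
users = {"User1": ["datarow1", "datarow2", "datarow3"]}

docs = flatten(users, data_rows)
for doc in docs:
    print(doc)
```

The trade-off is the usual one for de-normalizing: the row data is duplicated for every user that references it, so the index is larger and a change to a data row means re-indexing each document that carries a copy of it.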