In my current use case I have 4 tables with a one-to-many relationship between them (one is the parent and the rest are the children), and I have created a separate Solr core for each table. I now have a request to return all parents that match certain criteria, or that have a child matching either the same criteria or different criteria. Since moving all these documents into a single core implies more changes to the current code than keeping the cores as they are, I also considered a union of cross-core joins. I then ran some tests and saw that a join within a single core is not noticeably faster than a union of cross-core joins, so I don't know which solution to adopt. Do you see a use case where I would "hit the wall" if I keep the documents in separate cores?
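The "union of cross joins" described above could be expressed as a single Solr query string along the following lines. This is only a sketch: the core names (`child_a`, `child_b`, `child_c`), the field names (`parent_id`, `id`, `name`, `status`, `amount`) and the helper `cross_core_union` are hypothetical, not taken from the actual schema. It relies on Solr's `{!join ... fromIndex=...}` query parser and the `_query_` nested-query syntax.

```python
def cross_core_union(parent_clause, child_clauses, from_field="parent_id", to_field="id"):
    """Build one query string: parent match OR any cross-core child join.

    child_clauses is a list of (core_name, clause) pairs; all names are
    hypothetical placeholders for the real cores and fields.
    """
    parts = ["(" + parent_clause + ")"]
    for core, clause in child_clauses:
        # Escape backslashes and double quotes so the clause survives
        # being nested inside _query_:"..."
        escaped = clause.replace("\\", "\\\\").replace('"', '\\"')
        parts.append(
            '_query_:"{!join from=%s to=%s fromIndex=%s}%s"'
            % (from_field, to_field, core, escaped)
        )
    return " OR ".join(parts)


# Run against the parent core: parents matching directly, or via any child core.
query = cross_core_union(
    "name:acme",
    [
        ("child_a", "status:open"),
        ("child_b", "amount:[100 TO *]"),
        ("child_c", "status:open"),
    ],
)
print(query)
```

Each child core contributes one join clause, so a different criterion per child type only changes the clause passed for that core.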
BTW, the link below does not work (I found it while searching on this topic); it displays an empty page.

Thanks,
Mihaela

________________________________
From: Mikhail Khludnev <mkhlud...@griddynamics.com>
To: solr-user <solr-user@lucene.apache.org>; mihaela olteanu <mihaela...@yahoo.com>
Sent: Thursday, July 11, 2013 2:25 PM
Subject: Re: Performance of cross join vs block join

Mihaela,

To me it seems reasonable that a single-core join takes the same time as a cross-core one. I just can't see what gain could be obtained in the former case.
I can hardly comment on the join code; I looked into it, and it's not trivial, to say the least.

With block join there is no need to obtain parentId term values/numbers and look up parents by them. Both of these actions are expensive. Also, block join works as an iterator, whereas join needs to allocate memory for the parents bitset and populate it out of order, which impacts scalability.
Also, in None scoring mode BJQ doesn't need to walk through all children; it only needs to hit the first one. Another nice feature is 'both side leapfrog': if a highly restrictive filter/query intersects with BJQ, it can skip many parents and children as well. That's not possible in Join, which has a fairly 'full-scan' nature. The main performance factor for Join is the number of child docs.

I'm not sure I got all your questions; please specify them in more detail if something is still unclear.

Have you seen my benchmark http://blog.griddynamics.com/2012/08/block-join-query-performs.html ?

On Thu, Jul 11, 2013 at 1:52 PM, mihaela olteanu <mihaela...@yahoo.com> wrote:

> Hello,
>
> Does anyone know of any performance measurements for cross
> joins compared to joins inside a single index?
>
> Is a join inside a single index that stores all documents of
> various types (from the parent table or from the children tables) with a
> discriminator field faster than a cross join (where each
> document type resides in its own index)?
>
> I have performed some tests, but to me it seems that having a join in a
> single (bigger) index does not bring much of a speed improvement
> compared to cross joins.
>
> Why would a block join be faster than a cross join if this is the case?
> Which variables matter when trying to improve the query
> execution time?
>
> Thanks!
> Mihaela

--
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

<http://www.griddynamics.com>
<mkhlud...@griddynamics.com>
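
The difference Mikhail describes between Join and block join can be illustrated with a toy model. This is not Lucene's or Solr's actual code; the in-memory `DOCS` list, the field names and both functions are assumptions made purely for illustration. The join-style function collects parent ids from matching children, looks parents up by value and fills a bitset out of order. The block-join-style function makes one in-order pass over blocks in which children are stored right before their parent, and stops checking children of a block after the first hit.

```python
# Toy index in block order: each parent's children come right before it.
DOCS = [
    {"id": "c1", "parent_id": "p1", "color": "red"},
    {"id": "c2", "parent_id": "p1", "color": "blue"},
    {"id": "p1", "type": "parent"},
    {"id": "c3", "parent_id": "p2", "color": "blue"},
    {"id": "p2", "type": "parent"},
    {"id": "c4", "parent_id": "p3", "color": "red"},
    {"id": "p3", "type": "parent"},
]


def join_style(docs, child_matches):
    """Join-like: gather parent ids, look parents up by value, fill a bitset."""
    # Scan every child and collect the parent id values of the matches.
    wanted = {d["parent_id"] for d in docs if "parent_id" in d and child_matches(d)}
    # Lookup table from parent id value to position in the index.
    position_of = {d["id"]: i for i, d in enumerate(docs) if d.get("type") == "parent"}
    # Bitset sized to the whole index, populated in arbitrary (set) order.
    bits = [False] * len(docs)
    for parent_id in wanted:
        bits[position_of[parent_id]] = True
    return [docs[i]["id"] for i, bit in enumerate(bits) if bit]


def block_join_style(docs, child_matches):
    """Block-join-like: one in-order pass, no id lookups, no bitset."""
    hit = False
    for d in docs:
        if d.get("type") == "parent":
            if hit:
                yield d["id"]
            hit = False  # next block starts after this parent
        elif not hit and child_matches(d):
            hit = True  # first matching child is enough when scores are not needed


def is_red(doc):
    return doc.get("color") == "red"


print(join_style(DOCS, is_red))
print(list(block_join_style(DOCS, is_red)))
```

Both functions return the same parents; the second one works as an iterator and never materializes ids or a bitset, which is the scalability point made in the reply above.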