To pile on here: When you denormalize you also get some functionality that you do not get with Solr joins; they've been called "pseudo joins" in Solr for a reason.
If you just use the simple approach of indexing the two tables and then
joining across them, you can't return fields from both tables in a single
document. To do that you need to use parent/child docs, which have their
own restrictions.

So rather than worry excessively about which is faster, I'd recommend you
decide on the functionality you need as a starting point.

Best,
Erick

On Thu, Aug 31, 2017 at 7:34 AM, Walter Underwood <wun...@wunderwood.org> wrote:
> There is no way to tell which is faster without trying it.
>
> Query speed depends on the size of the data (rows), the complexity of the
> join, which database, what kind of disk, etc.
>
> Solr speed depends on the size of the documents, the complexity of your
> analysis chains, what kind of disk, how much CPU is available, etc.
>
> We have one query that extracts 9 million documents from MySQL in about
> 20 minutes. We have another query on a different MySQL database that
> takes 90 minutes to get 7 million documents.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/ (my blog)
>
>
>> On Aug 31, 2017, at 12:54 AM, Renuka Srishti <renuka.srisht...@gmail.com> wrote:
>>
>> Thanks Erick, Walter.
>> But I think a join query will reduce performance. Denormalization will
>> be a better way than a join query, am I right?
>>
>>
>>
>> On Wed, Aug 30, 2017 at 10:18 PM, Walter Underwood <wun...@wunderwood.org> wrote:
>>
>>> Think about making a denormalized view, with all the fields needed in
>>> one table. That view gets sent to Solr. Each row is a Solr document.
>>>
>>> It could be implemented as a view or as SQL, but that is a useful
>>> mental model for people starting from a relational background.
>>>
>>> wunder
>>> Walter Underwood
>>> wun...@wunderwood.org
>>> http://observer.wunderwood.org/ (my blog)
>>>
>>>
>>>> On Aug 30, 2017, at 9:14 AM, Erick Erickson <erickerick...@gmail.com> wrote:
>>>>
>>>> First, it's often best, by far, to denormalize the data in your Solr
>>>> index; that's what I'd explore first.
>>>>
>>>> If you can't do that, the join query parser might work for you.
>>>>
>>>> On Aug 30, 2017 4:49 AM, "Renuka Srishti" <renuka.srisht...@gmail.com> wrote:
>>>>
>>>>> Thanks, Susheel, for your response.
>>>>> Here is the scenario I am talking about:
>>>>>
>>>>> - Let's suppose there are two documents, doc1 and doc2.
>>>>> - I want to fetch the data from doc2 on the basis of the doc1 fields
>>>>>   that are related to doc2.
>>>>>
>>>>> How can this be achieved efficiently?
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Renuka Srishti
>>>>>
>>>>>
>>>>> On Mon, Aug 28, 2017 at 7:02 PM, Susheel Kumar <susheel2...@gmail.com> wrote:
>>>>>
>>>>>> Hello Renuka,
>>>>>>
>>>>>> I would suggest starting with your use case(s). Maybe start with
>>>>>> your first use case and the questions below:
>>>>>>
>>>>>> a) What is it that you want to search (which fields, like name,
>>>>>> desc, city, etc.)?
>>>>>> b) What is it that you want to show as part of the search result
>>>>>> (name, city, etc.)?
>>>>>>
>>>>>> Based on the above two questions, you would know what data to pull
>>>>>> in from the relational database, create the Solr schema, and index
>>>>>> the data.
>>>>>>
>>>>>> You may first try to denormalize / flatten the structure so that you
>>>>>> deal with one collection/schema and query on it.
>>>>>>
>>>>>> HTH.
>>>>>>
>>>>>> Thanks,
>>>>>> Susheel
>>>>>>
>>>>>> On Mon, Aug 28, 2017 at 8:04 AM, Renuka Srishti <
>>>>>> renuka.srisht...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> What is the best way to index a relational database, and how does
>>>>>>> it impact performance?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Renuka Srishti
>>>>>>>
>>>>>>
>>>>>
>>>
>>>
>
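
For what it's worth, here is a rough sketch of the "denormalized view"
approach Walter describes: do the join on the database side so that each
row is one flat document, then post the rows to Solr's JSON update handler.
Everything concrete below is made up for illustration (the orders/customers
tables, the field names, the collection URL), and sqlite3/urllib are only
stand-ins for whatever database driver and HTTP client you actually use:

    # Rough sketch only: table names, field names, and the Solr URL are
    # hypothetical; swap in your own database driver and schema.
    import json
    import sqlite3              # stand-in for MySQL or any other relational source
    import urllib.request

    SOLR_UPDATE = "http://localhost:8983/solr/orders/update?commit=true"

    def fetch_denormalized_rows(conn):
        """Join on the database side; each returned row becomes one Solr document."""
        cur = conn.execute(
            """
            SELECT o.id         AS id,
                   o.order_date AS order_date,
                   c.name       AS customer_name,
                   c.city       AS customer_city
            FROM orders o
            JOIN customers c ON c.id = o.customer_id
            """
        )
        cols = [d[0] for d in cur.description]
        for row in cur:
            yield dict(zip(cols, row))

    def index_docs(docs):
        """POST the flattened rows to Solr as a JSON array of documents."""
        data = json.dumps(list(docs)).encode("utf-8")
        req = urllib.request.Request(
            SOLR_UPDATE,
            data=data,
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            return resp.read()

    if __name__ == "__main__":
        conn = sqlite3.connect("shop.db")   # hypothetical source database
        index_docs(fetch_denormalized_rows(conn))

For contrast, if the two tables were kept as separate indexes, the join
query parser would look something like this (again with made-up field
names):

    q={!join from=id to=customer_id fromIndex=customers}city:somecity

and the results would contain only the order-side documents and fields,
never the customer fields, which is exactly the restriction discussed above
and why flattening up front is usually worth trying first.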