I suspect (but am not certain) that one reason for the performance difference between Solr and Lucene joins is that Solr operates on a top-level reader.
This results in fast joins, but it means that whenever you open a new reader (an NRT reader) there is a high cost to regenerate the top-level data structures. But if the app doesn't open NRT readers, or opens them rarely, perhaps that cost is a good tradeoff to get faster joins.

Mike McCandless

http://blog.mikemccandless.com

On Thu, Feb 13, 2014 at 12:10 AM, anand chandak <anand.chan...@oracle.com> wrote:
> Re-posting...
>
> Thanks,
>
> Anand
>
> On 2/12/2014 10:55 AM, anand chandak wrote:
>>
>> Thanks David, really helpful response.
>>
>> You mentioned that if we have to add scoring support in Solr, then a
>> possible approach would be to add a custom QueryParser, which might use
>> Lucene's join module. I have tried this approach and it is slow, I
>> believe because it runs more searches.
>>
>> I'm curious whether it is possible instead to enhance Solr's existing
>> JoinQParserPlugin and add the scoring support in the same class. Do you
>> think it's feasible and recommended? If yes, what would it take (at a
>> high level) in terms of code changes? Any pointers?
>>
>> Thanks,
>>
>> Anand
>>
>> On 2/12/2014 10:31 AM, David Smiley (@MITRE.org) wrote:
>>>
>>> Hi Anand.
>>>
>>> Solr's JOIN query, {!join}, constant-scores. It's simpler, faster, and
>>> more memory efficient (particularly in worst-case memory use) to
>>> implement the JOIN query without scoring, so that's why. Of course, you
>>> might want it to score and pay whatever penalty is involved. For that
>>> you'll need to write a Solr "QueryParser" that might use Lucene's "join"
>>> module, which has scoring variants. I've taken this approach before. You
>>> asked a specific question about the purpose of JoinScorer when it
>>> doesn't actually score. Lucene's "Query" produces a "Weight", which in
>>> turn produces a "Scorer" that is a DocIdSetIterator plus it returns a
>>> score. So Queries have to have a Scorer to match any document, even if
>>> the score is always 1.
>>>
>>> Solr does indeed have a lot of caching; that may be in play here when
>>> comparing against a quick attempt at using Lucene directly. In
>>> particular, the matching documents are likely to end up in Solr's
>>> DocumentCache. Returning stored fields in search results is one of the
>>> more expensive things Lucene/Solr does.
>>>
>>> I also think you noted that the fields on documents from the "from" side
>>> of the query are not available to be returned in search results, just
>>> those from the "to" side. Yup; that's true. To remedy this, you might
>>> write a Solr SearchComponent that adds fields from the "from" side. That
>>> could be tricky to do; it would probably need to re-run the from-side
>>> query but filtered to the matching top-N documents being returned.
>>>
>>> ~ David
>>>
>>> anand chandak wrote
>>>>
>>>> Resending, if somebody can please respond.
>>>>
>>>> Thanks,
>>>>
>>>> Anand
>>>>
>>>> On 2/5/2014 6:26 PM, anand chandak wrote:
>>>> Hi,
>>>>
>>>> I have a question on join scoring: why doesn't the Solr join query
>>>> return scores? Looking at the code, I see there's a JoinScorer defined
>>>> in the JoinQParserPlugin class. If it's not used for scoring, where is
>>>> it actually used?
>>>>
>>>> Also, to evaluate the performance of the Solr join plugin vs. Lucene's
>>>> JoinUtil, I ran the same join query against the same data set and the
>>>> same schema, and in the results I always see Solr's QTime much lower
>>>> than Lucene's. What is the reason behind this? Solr doesn't return
>>>> scores; could that cause so much difference?
>>>>
>>>> My guess is that Solr has a very sophisticated caching mechanism that
>>>> might be coming into play. Is that true? Or is there a difference in
>>>> the way the JOIN happens in the two approaches?
>>>>
>>>> If I understand correctly, both implementations use a two-pass
>>>> approach: first they collect all the terms from the fromField, and
>>>> then they return all documents that have matching terms in the toField.
>>>>
>>>> If somebody can throw some light on this, I would highly appreciate it.
>>>>
>>>> Thanks,
>>>> Anand
>>>
>>> -----
>>> Author:
>>> http://www.packtpub.com/apache-solr-3-enterprise-search-server/book
>>> --
>>> View this message in context:
>>> http://lucene.472066.n3.nabble.com/Join-Scoring-tp4115539p4116818.html
>>> Sent from the Solr - User mailing list archive at Nabble.com.
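
For readers following the thread, the two-pass approach Anand describes, and the difference between constant-scoring and scoring that David mentions, can be sketched in plain Java. This is only a toy in-memory illustration and not the code in Solr's JoinQParserPlugin or Lucene's join module: the class name, the method names, the "parent_id"/"id" fields, and the choice to keep the maximum from-side score per term are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy illustration of a two-pass join. All names here are hypothetical;
// this does not use or reproduce any Solr or Lucene API.
class TwoPassJoinSketch {

    // docs:     each document is a map of field name -> single string value;
    //           the list index plays the role of the doc id.
    // fromHits: doc id -> score for documents matched by the "from" query.
    // scoring:  false = constant score of 1.0 for every joined document,
    //           true  = carry the best from-side score over to the "to" side.
    static Map<Integer, Float> join(List<Map<String, String>> docs,
                                    Map<Integer, Float> fromHits,
                                    String fromField,
                                    String toField,
                                    boolean scoring) {
        // Pass 1: collect the join terms found in fromField of the from-side hits.
        // A constant-score join only needs the set of terms; a scoring join also
        // has to remember a score per term, which is extra bookkeeping.
        Map<String, Float> termScores = new HashMap<>();
        for (Map.Entry<Integer, Float> hit : fromHits.entrySet()) {
            String term = docs.get(hit.getKey()).get(fromField);
            if (term == null) {
                continue; // this hit has no value in fromField
            }
            float score = scoring ? hit.getValue() : 1.0f;
            termScores.merge(term, score, Float::max); // assumption: keep the max
        }

        // Pass 2: return every document whose toField holds a collected term.
        Map<Integer, Float> result = new TreeMap<>();
        for (int docId = 0; docId < docs.size(); docId++) {
            String term = docs.get(docId).get(toField);
            Float score = (term == null) ? null : termScores.get(term);
            if (score != null) {
                result.put(docId, score);
            }
        }
        return result;
    }

    // Two "to"-side documents (ids p1, p2) and four "from"-side documents.
    static List<Map<String, String>> sampleDocs() {
        return List.of(
            Map.of("id", "p1"),
            Map.of("id", "p2"),
            Map.of("parent_id", "p1"),
            Map.of("parent_id", "p1"),
            Map.of("parent_id", "p2"),
            Map.of("parent_id", "p3"));
    }

    public static void main(String[] args) {
        // Docs 2, 3 and 5 matched the from query with these made-up scores.
        Map<Integer, Float> fromHits = Map.of(2, 0.5f, 3, 2.0f, 5, 1.5f);
        System.out.println(join(sampleDocs(), fromHits, "parent_id", "id", false)); // {0=1.0}
        System.out.println(join(sampleDocs(), fromHits, "parent_id", "id", true));  // {0=2.0}
    }
}
```

In this sketch both modes do the same two passes, so the sketch by itself says nothing about the QTime gap discussed above; it only shows what extra state a scoring join has to carry from the first pass into the second.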