I've never reviewed that join query debug info - very interesting. Break it all down - do the universe: query directly. Then see what the results are and manually construct a query using the results of the join query. If none of that works, try either a profiler to see where Solr is spending most of its time, or try connecting a debugger (with source code) such as Eclipse, to Solr and see where it is spending most of its time.
Just some thoughts. Upayavira On Wed, Sep 9, 2015, at 04:24 PM, Russell Taylor wrote: > Hi Upayavira, > Here are a couple examples with debugQuery set. > I've mislead Mikhail as the query times are getting longer as the list of > ids gets bigger. > > Can you see a reason why where indexB has only 6 id's in its list it > still takes 46 seconds? > > Ids=6 > { > "responseHeader": { > "status": 0, > "QTime": 46849, > "params": { > "debugQuery": "true", > "indent": "true", > "start": "0", > "q": "{!join from=sedolKey to=sedolKey > fromIndex=indexB}universe:55AL86", > "_": "1441810031117", > "wt": "json" > } > }, > "response": { > "numFound": 0, > "start": 0, > "docs": [] > }, > "debug": { > "join": { > "{!join from=sedolKey to=sedolKey > fromIndex=indexB}universe:55AL86": { > "time": 46848, > "fromSetSize": 6, > "toSetSize": 0, > "fromTermCount": 11837043, > "fromTermTotalDf": 11837043, > "fromTermDirectCount": 11837043, > "fromTermHits": 6, > "fromTermHitsTotalDf": 6, > "toTermHits": 0, > "toTermHitsTotalDf": 0, > "toTermDirectCount": 0, > "smallSetsDeferred": 0, > "toSetDocsAdded": 0 > } > }, > "rawquerystring": "{!join from=longValue to=longValue > fromIndex=indexB}universe:55AL86", > "querystring": "{!join from=longValue to=longValue > fromIndex=indexB}universe:55AL86", > "parsedquery": "JoinQuery({!join from=longValue to=longValue > fromIndex=indexB}universe:55AL86)", > "parsedquery_toString": "{!join from=longValue to=longValue > fromIndex=indexB}universe:55AL86", > "explain": {}, > "QParser": "", > "timing": { > "time": 46849, > "prepare": { > "time": 0, > "query": { > "time": 0 > }, > "facet": { > "time": 0 > }, > "mlt": { > "time": 0 > }, > "highlight": { > "time": 0 > }, > "stats": { > "time": 0 > }, > "expand": { > "time": 0 > }, > "debug": { > "time": 0 > } > }, > "process": { > "time": 46848, > "query": { > "time": 46848 > }, > "facet": { > "time": 0 > }, > "mlt": { > "time": 0 > }, > "highlight": { > "time": 0 > }, > "stats": { > "time": 0 > }, > "expand": { > "time": 0 > }, > "debug": { > "time": 0 > } > } > } > } > } > > ########################################### > Ids=298 > { > "responseHeader": { > "status": 0, > "QTime": 51570, > "params": { > "debugQuery": "true", > "indent": "true", > "q": "{!join from=longValue to=longValue > fromIndex=indexB}universe:16XO52", > "_": "1441810442921", > "wt": "json" > } > }, > "response": { > "numFound": 0, > "start": 0, > "docs": [] > }, > "debug": { > "join": { > "{!join from=longValue to=longValue > fromIndex=indexB}universe:16XO52": { > "time": 51570, > "fromSetSize": 298, > "toSetSize": 0, > "fromTermCount": 11837043, > "fromTermTotalDf": 11837043, > "fromTermDirectCount": 11837043, > "fromTermHits": 298, > "fromTermHitsTotalDf": 298, > "toTermHits": 0, > "toTermHitsTotalDf": 0, > "toTermDirectCount": 0, > "smallSetsDeferred": 0, > "toSetDocsAdded": 0 > } > }, > "rawquerystring": "{!join from=longValue to=longValue > fromIndex=indexB}universe:16XO52", > "querystring": "{!join from=longValue to=longValue > fromIndex=indexB}universe:16XO52", > "parsedquery": "JoinQuery({!join from=longValue to=longValue > fromIndex=indexB}universe:16XO52)", > "parsedquery_toString": "{!join from=longValue to=longValue > fromIndex=indexB}universe:16XO52", > "explain": {}, > "QParser": "", > "timing": { > "time": 51570, > "prepare": { > "time": 0, > "query": { > "time": 0 > }, > "facet": { > "time": 0 > }, > "mlt": { > "time": 0 > }, > "highlight": { > "time": 0 > }, > "stats": { > "time": 0 > }, > "expand": { > "time": 0 > }, > "debug": { > "time": 0 > } > }, > "process": { > "time": 51570, > "query": { > "time": 51570 > }, > "facet": { > "time": 0 > }, > "mlt": { > "time": 0 > }, > "highlight": { > "time": 0 > }, > "stats": { > "time": 0 > }, > "expand": { > "time": 0 > }, > "debug": { > "time": 0 > } > } > } > } > } > > ################################################################### > Id's = 424088 > "debug": { > "join": { > "{!join from=longValue to=longValue > fromIndex=indexB}universe:LARGE": { > "time": 44386, > "fromSetSize": 424088, > "toSetSize": 892314, > "fromTermCount": 11837043, > "fromTermTotalDf": 11837043, > "fromTermDirectCount": 11837043, > "fromTermHits": 420365, > "fromTermHitsTotalDf": 420365, > "toTermHits": 57074, > "toTermHitsTotalDf": 944597, > "toTermDirectCount": 55722, > "smallSetsDeferred": 1352, > "toSetDocsAdded": 892314 > } > }, > "rawquerystring": "{!join from=longValue to=longValue > fromIndex=indexB}universe:LARGE", > "querystring": "{!join from=longValue to=longValue > fromIndex=indexB}universe:LARGE", > "parsedquery": "JoinQuery({!join from=longValue to=longValue > fromIndex=indexB}universe:LARGE)", > "parsedquery_toString": "{!join from=longValue to=longValue > fromIndex=indexB}universe:LARGE", > "explain": { > "2000000076769983": "\n1.0 = (MATCH) > org.apache.solr.search.JoinQuery$JoinQueryWeight@483489aa , product > of:\n 1.0 = boost\n 1.0 = queryNorm\n", > "2000000076769984": "\n1.0 = (MATCH) > org.apache.solr.search.JoinQuery$JoinQueryWeight@118f9aef , product > of:\n 1.0 = boost\n 1.0 = queryNorm\n", > "2000000076769985": "\n1.0 = (MATCH) > org.apache.solr.search.JoinQuery$JoinQueryWeight@7a5d18ac , product > of:\n 1.0 = boost\n 1.0 = queryNorm\n", > "2000000076769986": "\n1.0 = (MATCH) > org.apache.solr.search.JoinQuery$JoinQueryWeight@6d601adc , product > of:\n 1.0 = boost\n 1.0 = queryNorm\n", > "2000000076769987": "\n1.0 = (MATCH) > org.apache.solr.search.JoinQuery$JoinQueryWeight@2e262f31 , product > of:\n 1.0 = boost\n 1.0 = queryNorm\n", > "2000000076769988": "\n1.0 = (MATCH) > org.apache.solr.search.JoinQuery$JoinQueryWeight@4b200302 , product > of:\n 1.0 = boost\n 1.0 = queryNorm\n", > "2000000076769989": "\n1.0 = (MATCH) > org.apache.solr.search.JoinQuery$JoinQueryWeight@569910d2 , product > of:\n 1.0 = boost\n 1.0 = queryNorm\n", > "2000000076770006": "\n1.0 = (MATCH) > org.apache.solr.search.JoinQuery$JoinQueryWeight@635a3843 , product > of:\n 1.0 = boost\n 1.0 = queryNorm\n", > "2000000076770007": "\n1.0 = (MATCH) > org.apache.solr.search.JoinQuery$JoinQueryWeight@5b438a92 , product > of:\n 1.0 = boost\n 1.0 = queryNorm\n", > "2000000076770029": "\n1.0 = (MATCH) > org.apache.solr.search.JoinQuery$JoinQueryWeight@31d9c4c , product > of:\n 1.0 = boost\n 1.0 = queryNorm\n" > }, > "QParser": "", > "timing": { > "time": 480859, > "prepare": { > "time": 0, > "query": { > "time": 0 > }, > "facet": { > "time": 0 > }, > "mlt": { > "time": 0 > }, > "highlight": { > "time": 0 > }, > "stats": { > "time": 0 > }, > "expand": { > "time": 0 > }, > "debug": { > "time": 0 > } > }, > "process": { > "time": 480859, > "query": { > "time": 43737 > }, > "facet": { > "time": 0 > }, > "mlt": { > "time": 0 > }, > "highlight": { > "time": 0 > }, > "stats": { > "time": 0 > }, > "expand": { > "time": 0 > }, > "debug": { > "time": 437122 > } > } > } > } > } > > > Thanks > > > Russ. > -----Original Message----- > From: Upayavira [mailto:u...@odoko.co.uk] > Sent: 09 September 2015 13:02 > To: solr-user@lucene.apache.org > Subject: Re: Solr Join between two indexes taking too long. > > To explain what a join does: > > It goes over to the joined index, and executes a query. This results in a > list of "ids" that will be used to do a search on the main index. The > more of these ids there are, the worse performance will be. Thus, if you > have 100k documents that match in the join core, you will be doing a 100k > term search, which will invariably be painful, because the more terms you > include in the search, the slower it will be. > > How many matching docs do you have on the other side of your query? > > Upayavira > > On Tue, Sep 8, 2015, at 02:09 PM, Russell Taylor wrote: > > Hi, > > I hope somebody can help. > > > > We have two indexes, one which holds the descriptive data and the > > other one which holds lists of docs which are of a certain type > > (called universes in our world). They need to be joined together to > > show a list of data from indexA where a filtered indexB (by > > universe:value) has matching longs (The join field). > > > > At the moment the query is taking 55 seconds we need to get it under a > > second, any help most appreciated. > > > > INDEXES: > > > > Index a (primary index) > > 31 million docs with a converted alphanumeric to a long value with a > > possible 10 million unique values. > > > > Index B (the joined index) > > 250 million documents with a converted alphanumeric to a long value > > with a possible 10 million unique values. > > IndexB is filtered by universe which could be between 1 and 500,000 docs. > > > > QUERY: > > http://127.0.0.1:8080/solr/indexA/select?q={!join+from=longValue+to=lo > > ngValue+fromIndex=IndexB}universe:<http://127.0.0.1:8080/solr/indexA/s > > elect?q=%7b!join+from=longValue+to=longValue+fromIndex=IndexB%7duniver > > se:>universeValue > > > > Qtime is 55 seconds for either a universe of 5 docs or 500,000 docs. > > > > > > > > Thanks > > > > > > Russ. > > > > > > ******************************************************* > > This message (including any files transmitted with it) may contain > > confidential and/or proprietary information, is the property of > > Interactive Data Corporation and/or its subsidiaries, and is directed > > only to the addressee(s). If you are not the designated recipient or > > have reason to believe you received this message in error, please > > delete this message from your system and notify the sender > > immediately. An unintended recipient's disclosure, copying, > > distribution, or use of this message or any attachments is prohibited and > > may be unlawful. > > ******************************************************* > > > ******************************************************* > This message (including any files transmitted with it) may contain > confidential and/or proprietary information, is the property of > Interactive Data Corporation and/or its subsidiaries, and is directed > only to the addressee(s). If you are not the designated recipient or have > reason to believe you received this message in error, please delete this > message from your system and notify the sender immediately. An unintended > recipient's disclosure, copying, distribution, or use of this message or > any attachments is prohibited and may be unlawful. > ******************************************************* >