I'm running a cross-core join query, and it is about 30X slower than
either of the two individual queries. Here are the queries:

Main query: http://localhost:8983/solr/mainindex/select?q=title:java
QTime: 5 msec
hit count: 1000

Sub query: http://localhost:8983/solr/subindex/select?q=+fld1:[0.1 TO 0.3]
QTime: 4 msec
hit count: 25K

Join query:
http://localhost:8983/solr/mainindex/select?q=title:java&fq={!join fromIndex=mainindex toIndex=subindex from=docid to=docid}fld1:[0.1 TO 0.3]
QTime: 160 msec
hit count: 205
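
In case the way the request is sent matters: the fq value contains spaces, braces and brackets, so it has to be URL-encoded when it is not pasted into a browser. Below is a minimal sketch of how such a request URL can be built. JoinUrlSketch and buildUrl are names made up for this example, the local-params string is the one echoed in the debug output further down, and Java 10+ is assumed for the Charset overload of URLEncoder.encode.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Solr or SolrJ.
class JoinUrlSketch {

    // Builds a select URL with URL-encoded q and fq parameters.
    static String buildUrl(String base, String q, String fq) {
        return base
            + "?q=" + URLEncoder.encode(q, StandardCharsets.UTF_8)
            + "&fq=" + URLEncoder.encode(fq, StandardCharsets.UTF_8)
            + "&debugQuery=true";
    }

    public static void main(String[] args) {
        // Join filter as it appears in the debug output of this post.
        String fq = "{!join from=docid to=docid fromIndex=subindex}fld1:[0.1 TO 0.3]";
        String url = buildUrl(
            "http://localhost:8983/solr/mainindex/select", "title:java", fq);
        System.out.println(url);
    }
}
```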

Here are the index specs:

mainindex size: 117K docs, 1 segment
mainindex schema:
   <field name="docid" type="int" indexed="true" stored="true"
          required="true" multiValued="false" />
   <field name="title" type="text_en_splitting" indexed="true"
          stored="true" multiValued="false" />
   <uniqueKey>docid</uniqueKey>

subindex size: 117K docs, 1 segment
subindex schema:
   <field name="docid" type="int" indexed="true" stored="true"
          required="true" multiValued="false" />
   <field name="fld1" type="float" indexed="true" stored="true"
          required="false" multiValued="false" />
   <uniqueKey>docid</uniqueKey>

With debugQuery=true I see:
  "debug":{
    "join":{
      "{!join from=docid to=docid fromIndex=subindex}fld1:[0.1 TO 0.3]":{
        "time":155,
        "fromSetSize":24742,
        "toSetSize":24742,
        "fromTermCount":117810,
        "fromTermTotalDf":117810,
        "fromTermDirectCount":117810,
        "fromTermHits":24742,
        "fromTermHitsTotalDf":24742,
        "toTermHits":24742,
        "toTermHitsTotalDf":24742,
        "toTermDirectCount":24627,
        "smallSetsDeferred":115,
        "toSetDocsAdded":24742}},

Using a profiler and a debugger, I see about 150 msec spent in the outer
'while(term!=null)' loop in JoinQueryWeight.getDocSet(). That seems like
a lot of time just to join the bitsets. Is this expected?
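
To make the question concrete, here is how I read that loop, written as a simplified stand-alone sketch. This is not the Solr source: TermJoinSketch, the map-based postings, and the sample data are all made up for illustration. It walks every term of the from field, keeps the terms whose postings intersect the from-side result set, and ORs the matching to-side postings into the result. If that reading is right, the loop runs once per unique docid (fromTermCount is 117810 above) rather than once per sub-query hit, which for 150 msec works out to roughly 1.3 usec per term.

```java
import java.util.BitSet;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical, simplified model of a term-enumeration join (not Solr code).
class TermJoinSketch {

    // fromPostings / toPostings: term value -> ids of the documents containing it.
    // fromSet: documents in the "from" index that matched the sub query.
    static BitSet join(SortedMap<Integer, int[]> fromPostings,
                       Map<Integer, int[]> toPostings,
                       BitSet fromSet) {
        BitSet result = new BitSet();
        // Outer loop: one iteration per term in the from field,
        // regardless of how many documents the sub query matched.
        for (Map.Entry<Integer, int[]> e : fromPostings.entrySet()) {
            boolean hit = false;
            for (int doc : e.getValue()) {
                if (fromSet.get(doc)) { hit = true; break; }
            }
            if (!hit) continue;                 // term not used by any matching doc
            int[] toDocs = toPostings.get(e.getKey());
            if (toDocs == null) continue;       // term absent from the to field
            for (int doc : toDocs) result.set(doc);
        }
        return result;
    }

    public static void main(String[] args) {
        SortedMap<Integer, int[]> from = new TreeMap<>();
        Map<Integer, int[]> to = new TreeMap<>();
        int[] toDocForTerm = {4, 0, 3, 2, 1};
        for (int i = 0; i < 5; i++) {
            from.put(10 + i, new int[] {i});            // docid 10..14 in from-docs 0..4
            to.put(10 + i, new int[] {toDocForTerm[i]});
        }
        BitSet fromSet = new BitSet();
        fromSet.set(1);                                  // sub query matched from-docs 1 and 3
        fromSet.set(3);
        System.out.println(join(from, to, fromSet));     // prints {0, 2}
    }
}
```

In this toy run the loop visits all 5 terms even though only 2 documents matched the sub query.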

Peter
