Query Regarding SOLR cross collection join

Doss Wed, 22 Jan 2020 05:27:53 -0800

Hi,

SOLR version 8.3.1 (10 nodes), ZooKeeper ensemble (3 nodes)


One of our use cases requires a join between two large indexes. As
required by SOLR, one index (2GB) has one shard and 10 replicas, and the
other has 10 shards (40GB per shard).
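
For reference, here is a minimal sketch of the general shape of such a cross-collection join request, built in Python. The collection name (small_collection), field names (user_id, status) and the helper build_join_params are placeholders for illustration only, not our real schema; the {!join from=... to=... fromIndex=...} local-params syntax is the standard SOLR join parser.

```python
from urllib.parse import urlencode


def build_join_params(from_field, to_field, from_index, inner_query, main_query="*:*"):
    """Build request parameters for a cross-collection join used as a filter query.

    All argument values are caller-supplied placeholders; nothing here is
    taken from the real schema.
    """
    # The join parser runs inner_query against from_index, collects the values
    # of from_field, and matches local documents whose to_field has one of them.
    join_fq = "{!join from=%s to=%s fromIndex=%s}%s" % (
        from_field, to_field, from_index, inner_query)
    # debugQuery=true asks SOLR to include debug information in the response.
    return {"q": main_query, "fq": join_fq, "debugQuery": "true"}


# Hypothetical names, for illustration only.
params = build_join_params("user_id", "user_id", "small_collection", "status:active")
query_string = urlencode(params)
```

The resulting query_string would be appended to the /select URL of the sharded collection.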

The query takes too long, sometimes several minutes. How can we improve
this?

The debug query output contains one or more blocks like the following,
depending on the number of shards (I believe):

        "time":303442,
        "fromSetSize":0,
        "toSetSize":81653955,
        "fromTermCount":0,
        "fromTermTotalDf":0,
        "fromTermDirectCount":0,
        "fromTermHits":0,
        "fromTermHitsTotalDf":0,
        "toTermHits":0,
        "toTermHitsTotalDf":0,
        "toTermDirectCount":0,
        "smallSetsDeferred":0,
        "toSetDocsAdded":0},

What does toSetSize mean here? Does it mean that 81MB of data is read from
the index? How can we reduce this?

I read somewhere that the score join parser is faster, but for me it
produces no results. I am using string type fields for from and to.
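
As a rough sketch of what is meant by the score join variant: the same join local params with a score parameter added, which selects the score join implementation. The names below (small_collection, user_id, status, build_score_join) are again hypothetical placeholders, and this is only one assumed way of writing it, not necessarily the exact query that returned no results.

```python
# Score modes accepted by the score join parser.
SCORE_MODES = {"none", "avg", "max", "min", "total"}


def build_score_join(from_field, to_field, from_index, inner_query, score="none"):
    """Return a join query string that includes the score local parameter."""
    if score not in SCORE_MODES:
        raise ValueError("unsupported score mode: %r" % (score,))
    # Adding score=... to the local params switches to the score join parser.
    return "{!join from=%s to=%s fromIndex=%s score=%s}%s" % (
        from_field, to_field, from_index, score, inner_query)


# Hypothetical names, for illustration only.
score_join_fq = build_score_join("user_id", "user_id", "small_collection", "status:active")
```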


Thanks!
