I have a SolrCloud setup (Solr 7.4) with a collection "test" that has two
shards on two different nodes. There are 4M records, evenly distributed
across the shards.

If I query the collection as below, it is slow:
http://localhost:8983/solr/test/select?q=*:*&rows=100000
QTime: 6930

If I query a particular shard as below, it is also slow:
http://localhost:8983/solr/test_shard1_replica_n2/select?q=*:*&rows=100000&shards=shard2
QTime: 5494
Note that the shards parameter says shard2 while the core being queried
belongs to shard1.

But this is much faster:
http://localhost:8983/solr/test_shard1_replica_n2/select?q=*:*&rows=100000&shards=shard1
QTime: 57

This is also much faster:
http://localhost:8983/solr/test_shard2_replica_n4/select?q=*:*&rows=100000&shards=shard2
QTime: 71
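
For anyone who wants to reproduce the comparison, here is a minimal sketch
in Python. It assumes the node answers on localhost:8983 as in the URLs
above. The helper names (select_url, qtime, compare) are only illustrative,
and wt=json is added so that QTime can be read from the responseHeader of
the JSON response.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE = "http://localhost:8983/solr"  # host and port taken from the URLs above

# (collection or core to query, value of the shards parameter or None)
CASES = [
    ("test", None),                        # whole collection: slow
    ("test_shard1_replica_n2", "shard2"),  # core of shard1, asks for shard2: slow
    ("test_shard1_replica_n2", "shard1"),  # core and shard match: fast
    ("test_shard2_replica_n4", "shard2"),  # core and shard match: fast
]


def select_url(target, shards=None, rows=100000):
    """Build a /select URL, optionally restricted to one logical shard."""
    params = {"q": "*:*", "rows": rows, "wt": "json"}
    if shards is not None:
        params["shards"] = shards
    return f"{BASE}/{target}/select?{urlencode(params)}"


def qtime(url):
    """Run the query and return QTime (milliseconds) from the response header."""
    with urlopen(url) as resp:
        return json.load(resp)["responseHeader"]["QTime"]


def compare():
    """Print QTime for each of the four cases (needs a running Solr)."""
    for target, shards in CASES:
        print(target, shards, qtime(select_url(target, shards)))
```

Calling compare() against the running cluster prints one QTime per case, so
the slow and fast variants can be compared side by side.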

I don't think it is the network, as I ran similar tests on a single-node
setup and saw the same pattern. Querying a particular core together with
its own logical shard is much faster than querying a different shard
through that core, or querying the whole collection.

Why does this happen? How can I make the first two queries run as fast as
the last two?
