Brent <brent.pear...@gmail.com> wrote:
> I've been testing Solr Cloud 6.1.0 with two servers, and getting somewhat
> disappointing query latency. I'm comparing the latency with the same tests,
> running DSE in place of Solr Cloud. It's surprising, because running the
> test just on my laptop (running a single instance of Solr), I get
> significantly better latency with Solr than with DSE.

I can understand why it is surprising, but fortunately there is a simple
explanation:

> In theory, shouldn't 2 nodes running Solr be the fastest?

That depends on the setup, the corpus and the queries. My own rule of thumb:
Only shard if you are really sure it will help.

> When running Solr with just one node, I create the collection with 1 shard.
> When running Solr with both nodes, I create the collection with 2 shards.

The number of shards is the reason. With 1 shard, 1 request will be processed
by 1 core. With 2 shards, 1 request will be processed by 2 cores: If the
outside request is directed at core A, core A will send 1 request to core B,
find the top merged hits from both cores, then send another request to core B
to resolve the documents for the hits from core B (there might be some
optimization that does away with the second call in some cases, but it only
mitigates the problem).

As you can see, there is an overhead for using 2+ shards. For smaller setups,
that overhead can easily overshadow the gains from the extra hardware power
introduced by sharding.

If you want to improve your query performance, use 1 shard with replication.

- Toke Eskildsen
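
The request flow described above can be sketched as a toy model. This is not Solr code: the function name `distributed_search`, the dict-based "shards" and the request counter are all hypothetical, and the model assumes that the second phase only contacts remote shards that own at least one of the merged top hits.

```python
import heapq


def distributed_search(shards, rows):
    """Toy model of a two-phase sharded query (hypothetical, not Solr's API).

    shards: list of dicts mapping doc id -> score. shards[0] plays the role
            of core A, the core that receives the outside request.
    rows:   number of hits requested.
    Returns (top_doc_ids, remote_requests).
    """
    remote_requests = 0
    candidates = []

    # Phase 1: every shard reports its own top `rows` (id, score) pairs.
    # Asking any shard other than shards[0] costs one remote request.
    for shard_no, docs in enumerate(shards):
        if shard_no != 0:
            remote_requests += 1
        top = heapq.nlargest(rows, docs.items(), key=lambda kv: kv[1])
        candidates.extend((score, doc_id, shard_no) for doc_id, score in top)

    # Merge the per-shard candidates into the overall top `rows`.
    merged = heapq.nlargest(rows, candidates)

    # Phase 2: resolve the documents. Assumption of this model: one extra
    # request per remote shard that owns at least one merged hit.
    owners = {shard_no for _, _, shard_no in merged}
    remote_requests += len(owners - {0})

    return [doc_id for _, doc_id, _ in merged], remote_requests


one_shard = [{"a": 3.0, "b": 1.0, "c": 2.0}]
two_shards = [{"a": 3.0, "b": 1.0}, {"c": 2.0}]

ids_1, remote_1 = distributed_search(one_shard, rows=2)
ids_2, remote_2 = distributed_search(two_shards, rows=2)
print(ids_1, remote_1)
print(ids_2, remote_2)
```

Both layouts return the same hits, but the 2-shard layout pays for two extra round trips between the cores, which is the overhead that can outweigh the added hardware on a small index.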