Brent <brent.pear...@gmail.com> wrote:
> I've been testing Solr Cloud 6.1.0 with two servers, and getting somewhat
> disappointing query latency. I'm comparing the latency with the same tests,
> running DSE in place of Solr Cloud. It's surprising, because running the
> test just on my laptop (running a single instance of Solr), I get
> significantly better latency with Solr than with DSE.

I can understand why it is surprising, but fortunately there is a simple 
explanation:

> In theory, shouldn't 2 nodes running Solr be the fastest?

That depends on the setup, the corpus and the queries. My own rule of thumb:
Only shard if you are really sure it will help.

> When running Solr with just one node, I create the collection with 1 shard.
> When running Solr with both nodes, I create the collection with 2 shards.

The number of shards is the reason. With 1 shard, 1 request will be processed 
by 1 core.

With 2 shards, 1 request will be processed by 2 cores. If the outside request
is directed at core A, core A will send 1 request to core B and merge the top
hits from both cores. It will then send another request to core B to resolve
the documents for the hits that came from core B. There might be some
optimization that does away with the second call in some cases, but it only
mitigates the problem.
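
The request flow described above can be sketched as a toy simulation. This is
not Solr code: the Core class, its methods and the scores are made up for
illustration, and the counter only tracks how many calls each core receives
(for the coordinating core those calls are local).

```python
from dataclasses import dataclass


@dataclass
class Core:
    """Hypothetical stand-in for one shard's core (not a Solr class)."""
    name: str
    docs: dict          # doc_id -> (score, stored_text)
    calls: int = 0      # number of calls this core has received

    def top_ids(self, rows):
        # Phase 1: return only ids and scores of the best local hits.
        self.calls += 1
        ranked = sorted(self.docs.items(), key=lambda kv: kv[1][0], reverse=True)
        return [(doc_id, score) for doc_id, (score, _) in ranked[:rows]]

    def fetch(self, ids):
        # Phase 2: resolve the stored documents for the winning ids.
        self.calls += 1
        return {doc_id: self.docs[doc_id][1] for doc_id in ids}


def distributed_query(coordinator, others, rows):
    cores = [coordinator, *others]

    # Phase 1: ask every core for its top hits, then merge by score.
    candidates = []
    for core in cores:
        for doc_id, score in core.top_ids(rows):
            candidates.append((score, doc_id, core))
    candidates.sort(key=lambda c: c[0], reverse=True)
    winners = candidates[:rows]

    # Phase 2: go back to each core that owns a winning hit.
    resolved = {}
    for core in cores:
        ids = [doc_id for _, doc_id, owner in winners if owner is core]
        if ids:
            resolved.update(core.fetch(ids))
    return [resolved[doc_id] for _, doc_id, _ in winners]
```

With two cores that each own one of the top hits, core B receives two calls
for a single outside request, which is the extra round trip described above.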

As you can see, there is an overhead to using 2 or more shards. For smaller
setups, that overhead can easily overshadow the gains from the extra hardware
power introduced by sharding.

If you want to improve your query performance, use 1 shard with replication:
each request is then handled by a single core, and both nodes can serve
queries.
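
As a rough sketch of that setup, the snippet below only builds a Collections
API CREATE URL with 1 shard and 2 replicas. The host, port and collection name
are placeholders, and it assumes a usable config set is already available to
the cluster; adjust to your own installation before sending the request.

```python
from urllib.parse import urlencode


def create_collection_url(base_url, name, num_shards, replication_factor):
    """Build a Solr Collections API CREATE URL (nothing is sent)."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


# Placeholder base URL and collection name, for illustration only.
url = create_collection_url("http://localhost:8983/solr", "mycollection", 1, 2)
print(url)
```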

- Toke Eskildsen
