After rereading what I wrote earlier, I'll try to rephrase it more clearly.

It seems that splitting my collection across many shards slowed queries down
unreasonably, and I'm trying to investigate why.

First, I created "collection1": 4 shards * replicationFactor=1 on
2 servers. Then I created "collection2": 48 shards * replicationFactor=2
on 24 servers, keeping the same config and the same number of documents per
shard.
Observations showed the following:

   1. Total qTime for the same query set is 5 times higher in collection2
   (150 ms -> 700 ms).
   2. Adding the *shards.info=true* param to queries against collection2
   shows that each shard is much slower than each shard was in collection1
   (about 4 times slower).
   3. Querying only specific shards of collection2 (by adding the
   shards=shard1,shard2...shard12 param) gave me a much better qTime per
   shard (only 2 times higher than in collection1).
   4. I have a low QPS rate, so I don't suspect the replication factor
   of being the major cause of this.
   5. The average CPU load on the servers during querying was much higher in
   collection1 than in collection2, and I didn't catch any other bottleneck.

Q:
1. Why does the number of shards affect the qTime of each individual shard?
2. How can I bring the per-shard qTime back down?

Thanks,
Manu
