Taking another look at what I wrote earlier, let me rephrase more clearly.
It seems that sharding my collection across many shards slowed queries down unreasonably, and I'm trying to investigate why. First, I created "collection1": 4 shards * replicationFactor=1 on 2 servers. Then I created "collection2": 48 shards * replicationFactor=2 on 24 servers, keeping the same config and the same number of documents per shard. I observed the following:

1. Total qTime for the same query set is 5 times higher in collection2 (150ms -> 700ms).
2. Adding the shards.info=true param to queries against collection2 shows that each shard is much slower than each shard was in collection1 (about 4 times slower).
3. Querying only specific shards of collection2 (by adding the shards=shard1,shard2,...,shard12 param) gave a much better qTime per shard (only 2 times higher than in collection1).
4. I have a low qps rate, so I don't suspect the replication factor of being the major cause of this.
5. The avg. CPU load on the servers during querying was much higher in collection1 than in collection2, and I didn't catch any other bottleneck.

Questions:

1. Why does the number of shards affect the qTime of each shard?
2. How can I reduce the per-shard qTime back down?

Thanks, Manu
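For reference, the two diagnostic queries described above can be issued roughly like this (hostname, port, and the q parameter are placeholders for illustration; shards.info and shards are the standard Solr distributed-search parameters):

```shell
# Full distributed query with a per-shard timing breakdown in the response:
curl "http://solrhost:8983/solr/collection2/select?q=*:*&shards.info=true"

# Same query restricted to a subset of shards, to compare per-shard qTime:
curl "http://solrhost:8983/solr/collection2/select?q=*:*&shards=shard1,shard2,shard3&shards.info=true"
```

The shards.info=true response section reports qTime and hit counts per shard, which is how the per-shard numbers in observations 2 and 3 were obtained.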