Hello,

After running a benchmark session at small scale, I moved to full scale on 16 quad-core servers. At small scale I saw excellent qTime (about 150 ms) with up to 2 servers, and my search threads were mainly CPU bound. My query set is not faceted.

Going to full scale (with the same config, schema and number of docs per shard), I split my collection into 48 shards and added a replica for each. Since then I have seen a major performance deterioration: my qTime went up to 700 ms. The servers are under a much lighter load, and the network does not show any problems.

I understand that merging the responses and waiting for the slowest shard should add to my small-scale qTime, so I set shards.info=true and saw that each individual shard was taking much longer. When I restrict the query to specific shards (shards=shard1,shard2...shard12), I get a much better qTime for each shard as well as a better total qTime.
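For anyone who wants to reproduce the comparison, here is a minimal Python sketch of how the per-shard times can be collected. It is only an illustration under some assumptions: the base URL, collection name and shard names are hypothetical, the helper names are made up for this example, and the parsing assumes the JSON response carries a top-level `shards.info` section whose entries each have a `time` field in milliseconds.

```python
from urllib.parse import urlencode


def build_query_url(base_url, query, shards=None):
    """Build a select URL that asks Solr for per-shard details.

    base_url is a hypothetical collection URL such as
    http://localhost:8983/solr/collection1. If shards is given,
    the query is restricted to those shards only.
    """
    params = {"q": query, "wt": "json", "shards.info": "true"}
    if shards:
        params["shards"] = ",".join(shards)
    return f"{base_url}/select?{urlencode(params)}"


def shard_times(response):
    """Return (shard, time_ms) pairs, slowest first.

    response is the parsed JSON body. Entries without a "time"
    field (for example shards that returned an error) are skipped.
    """
    info = response.get("shards.info", {})
    times = {
        name: entry["time"]
        for name, entry in info.items()
        if "time" in entry
    }
    return sorted(times.items(), key=lambda kv: kv[1], reverse=True)
```

Running the same query once against all shards and once against a subset, then comparing the two `shard_times` lists, shows whether a given shard really gets slower when more shards take part in the request.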
With the same config, why does the number of shards affect the qTime of each individual shard? How can I overcome this issue?

Thanks,
Manu