On Mon, 2016-04-11 at 11:23 +0000, Bhaumik Joshi wrote:
> We are using solr 5.2.0 and we have Index-heavy (100 index updates per
> sec) and Query-heavy (100 queries per sec) scenario.
> Index stats: 10 million documents and 16 GB index size
> Which sharding strategy is best suited in above scenario?

Sharding reduces query throughput and can improve query latency as well as
indexing speed. For small indexes, the overhead of sharding is likely to
worsen query latency. So as always, it depends.

Qualified guess: Don't use multiple shards, but consider using replicas.

> Please share reference resources which states detailed comparison of
> single shard over multi shard if any.

Sorry, could not find the one I had in mind.

> Meanwhile we did some tests with SolrMeter (Standalone java tool for
> stress tests with Solr) for single shard and two shards.
>
> Index stats of test solr cloud: 0.7 million documents and 1 GB index
> size.
>
> As observed in test average query time with 2 shards is much higher
> than single shard.

Makes sense: Your shards are so small that the actual time spent on the
queries is very low. So relatively, the overhead of distributed (aka
multi-shard) searching is high, negating any search gain you got by
sharding. I would not have expected the performance drop-off to be that
large (factor 20-60) though.

Your query speed is unusually low for an index of your size, which leads me
to believe that your indexing is slowing everything down. This is often due
to too frequent commits and/or too many warm-up queries.

There is a bit about it at
https://wiki.apache.org/solr/SolrPerformanceFactors

- Toke Eskildsen, State and University Library, Denmark
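
To illustrate the point about too frequent commits: one common way to cut them down is to send updates in batches and let Solr decide when to commit via the commitWithin request parameter, instead of issuing an explicit commit after every update. The following is only a minimal sketch under assumptions, not a tested setup for your cluster. The helper names (batched, build_update_request), the collection name "mycollection", the base URL, the batch size of 100 and the 10 second commitWithin value are all hypothetical and chosen purely for illustration. The code only builds the HTTP requests; it does not send them.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def batched(docs, size):
    """Yield consecutive slices of docs with at most `size` items each."""
    for start in range(0, len(docs), size):
        yield docs[start:start + size]


def build_update_request(base_url, collection, docs, commit_within_ms):
    """Build (but do not send) a JSON update request using commitWithin.

    base_url and collection are placeholders for your own installation.
    No explicit commit is requested; Solr is asked to make the documents
    visible within commit_within_ms milliseconds.
    """
    query = urlencode({"commitWithin": commit_within_ms})
    url = f"{base_url}/{collection}/update?{query}"
    body = json.dumps(docs).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical documents; real ones would carry your own fields.
    docs = [{"id": str(i)} for i in range(250)]
    for batch in batched(docs, 100):
        req = build_update_request(
            "http://localhost:8983/solr", "mycollection", batch, 10000
        )
        # Pass req to urllib.request.urlopen to actually send it.
        print(req.get_method(), req.full_url, len(batch), "docs")
```

With 100 updates per second, batching like this would mean roughly one request per second rather than 100, and commits spaced seconds apart rather than after every update. Whether that is the actual cause of the slow queries in your test is something only measuring will tell.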