On Mon, 2016-04-11 at 11:23 +0000, Bhaumik Joshi wrote:
> We are using solr 5.2.0 and we have Index-heavy (100 index updates per
> sec) and Query-heavy (100 queries per sec) scenario.

> Index stats: 10 million documents and 16 GB index size

> Which sharding strategy is best suited in above scenario?

Sharding reduces query throughput and can improve query latency as well
as indexing speed. For small indexes, the overhead of sharding is likely
to worsen query latency. So as always, it depends.

Qualified guess: Don't use multiple shards, but consider using replicas.

> Please share reference resources which states detailed comparison of
> single shard over multi shard if any.

Sorry, could not find the one I had in mind.
> 
> Meanwhile we did some tests with SolrMeter (Standalone java tool for
> stress tests with Solr) for single shard and two shards.
> 
> Index stats of test solr cloud: 0.7 million documents and 1 GB index
> size.
> 
> As observed in test average query time with 2 shards is much higher
> than single shard.

Makes sense: Your shards are so small that the actual time spent on the
queries is very low. So relatively, the overhead of distributed (aka
multi-shard) searching is high, negating any search gain you got by
sharding. I would not have expected the performance drop-off to be that
large (factor 20-60) though.
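
As a back-of-the-envelope illustration of the overhead argument, here is
a toy latency model. It is only a sketch: it assumes the per-shard search
time scales down perfectly with the number of shards and that distributed
searching adds one fixed overhead per query. The function name and the
2 ms, 200 ms and 10 ms figures are made up for the example, not
measurements from any Solr setup.

```python
def distributed_latency_ms(single_shard_ms, shards, overhead_ms):
    """Toy estimate of query latency when an index is split into shards.

    Assumptions (hypothetical, for illustration only):
    - search time divides evenly across shards
    - distributed search adds a fixed overhead per query
    """
    if shards == 1:
        # A single shard needs no distributed merge step.
        return single_shard_ms
    return single_shard_ms / shards + overhead_ms


if __name__ == "__main__":
    # A fast query on a small index versus a slow query on a large one.
    for single_ms in (2.0, 200.0):
        sharded_ms = distributed_latency_ms(single_ms, shards=2, overhead_ms=10.0)
        print(f"1 shard: {single_ms} ms, 2 shards: {sharded_ms} ms")
```

With these made-up numbers the fast query gets slower with two shards,
while the slow query gets faster, which is the "it depends" part.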

Your query speed is unusually low for an index of your size, which leads
me to believe that your indexing is slowing everything down. This is
often due to commits that are too frequent and/or too many warm-up
queries.
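
If the frequent commits come from the indexing client, one possible way
to relax them is to stop issuing an explicit commit per update and pass
Solr's commitWithin update parameter instead. The sketch below only
builds the request and does not send it. The base URL, the collection
name "test", the helper name and the 10 second window are assumptions
for the example, not values from your setup.

```python
import json
import urllib.parse
import urllib.request


def build_update_request(base_url, collection, docs, commit_within_ms=10000):
    """Build (but do not send) a JSON update request using commitWithin.

    commitWithin asks Solr to make the documents searchable within the
    given number of milliseconds, instead of committing on every update.
    """
    query = urllib.parse.urlencode({"commitWithin": commit_within_ms})
    url = f"{base_url}/{collection}/update?{query}"
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical base URL, collection and document.
    req = build_update_request(
        "http://localhost:8983/solr", "test", [{"id": "1"}]
    )
    print(req.get_method(), req.full_url)
```

Sending the request would be a matter of passing it to
urllib.request.urlopen. Whether 10 seconds is acceptable depends on how
fresh your search results need to be.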

There is a bit about it at 
https://wiki.apache.org/solr/SolrPerformanceFactors


- Toke Eskildsen, State and University Library, Denmark


