To add to Erick's point:

It's also highly dependent on the types of queries you expect (sorting,
faceting, fq, q, size of documents) and on how many concurrent updates you
expect. If most queries are going to be similar and you are not updating
very often, you can expect most of your index to be held in the page cache
and many of your queries to be served from the document or query caches
(especially if you keep filter conditions in fq, where they are cached and
reused, rather than in q, which introduces scoring overhead; see the sketch
below). Adding more replicas will help distribute the load. Adding shards
will let you parallelize work but adds some memory and latency overhead
because results still need to be merged. If your shards are spread across
multiple machines, you also introduce network latency. I've seen good
success using many shards in the same JVM, but that was with collections of
billions of documents.
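
As a minimal sketch of the fq-vs-q point (the host, collection name, and
field names are assumptions, not from this thread):

    import requests

    SOLR = "http://localhost:8983/solr/products/select"

    # The user's text goes in q and is scored; the structured constraint
    # goes in fq, where it is unscored, cached in the filterCache, and
    # reused by every later query that repeats the same filter.
    params = {
        "q": "title:laptop",
        "fq": "category:electronics",
        "wt": "json",
    }
    resp = requests.get(SOLR, params=params)
    print(resp.json()["response"]["numFound"])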

On Tue, Nov 17, 2015 at 9:07 PM Erick Erickson <erickerick...@gmail.com>
wrote:

> I wouldn't bother to shard either. YMMV of course, but 2.2M documents
> is actually a pretty small number unless the docs themselves are huge.
> Sharding introduces inevitable overhead, so it's usually the last
> thing you resort to.
>
> As far as the number of replicas is concerned, that's strictly a
> function of what QPS you need. Let's say you do not shard and a single
> replica handles a query rate of 20 queries per second. If you need to
> support 100 QPS, just add 4 more replicas; this can be done at any time.
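>
> For illustration, a replica can be added on the fly with the Collections
> API; a minimal Python sketch (the host, collection, and shard names are
> assumptions):
>
>     import requests
>
>     # With an unsharded collection the single shard is "shard1".
>     requests.get("http://localhost:8983/solr/admin/collections", params={
>         "action": "ADDREPLICA",
>         "collection": "mycollection",
>         "shard": "shard1",
>     })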
>
> Best,
> Erick
>
> On Tue, Nov 17, 2015 at 3:38 PM, Markus Jelsma
> <markus.jel...@openindex.io> wrote:
> > Hi - we use the Siege load testing program. It can take a seed list of
> > URLs, taken from actual user input, and can generate load in parallel.
> > It won't reuse common queries unless you prepare your seed list
> > appropriately (a sketch follows). If your setup achieves the goal your
> > client anticipates, then you are fine. Siege is not a good tool for
> > testing extreme QPS because of obvious single-machine and network
> > limitations.
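> >
> > As an illustration, a minimal Python sketch that turns sampled user
> > queries into a Siege seed list (the file names, host, and collection
> > name are assumptions, not from the thread):
> >
> >     import urllib.parse
> >
> >     # Hypothetical input: one raw user query per line in queries.txt.
> >     base = "http://localhost:8983/solr/mycollection/select"
> >     with open("queries.txt") as src, open("urls.txt", "w") as out:
> >         for line in src:
> >             q = urllib.parse.quote(line.strip())
> >             out.write(base + "?q=" + q + "\n")
> >
> >     # Then drive the load, e.g.: siege -f urls.txt -c 25 -t 10M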
> >
> > Assuming your JVM heap settings and Solr cache settings are optimal,
> > and your only question is how many shards, then increase the number of
> > shards. Oversharding can be beneficial because more threads each
> > process less data. A search within a single core is single-threaded, so
> > oversharding on the same hardware makes sense, and it seems to pay off.
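> >
> > For illustration, a collection can be oversharded at creation time via
> > the Collections API; a minimal Python sketch (the host, collection
> > name, and shard counts are assumptions):
> >
> >     import requests
> >
> >     # maxShardsPerNode lets several shards share the same node/JVM;
> >     # assumes a suitable configset is already in ZooKeeper.
> >     requests.get("http://localhost:8983/solr/admin/collections", params={
> >         "action": "CREATE",
> >         "name": "mycollection",
> >         "numShards": 8,
> >         "replicationFactor": 1,
> >         "maxShardsPerNode": 8,
> >     })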
> >
> > Make sure you run multiple long stress tests and restart the JVMs in
> > between, because a) query times and load tend to regress to the mean,
> > and b) HotSpot needs to 'warm up', so short tests make less sense.
> >
> > M.
> >
> >
> >
> > -----Original message-----
> >> From: Aswath Srinivasan (TMS) <aswath.sriniva...@toyota.com>
> >> Sent: Tuesday 17th November 2015 23:46
> >> To: solr-user@lucene.apache.org
> >> Subject: Performance testing on SOLR cloud
> >>
> >> Hi fellow developers,
> >>
> >> Please share your experience on how you did performance testing on
> >> SOLR. What I'm trying to do is have SOLR cloud on 3 Linux servers with
> >> 16 GB RAM and index a total of 2.2 million documents. Yet to decide
> >> how many shards and replicas to have (any hint on this is welcome too;
> >> basically 'only' performance testing, so suggest the number of shards
> >> and replicas if you can). Ultimately, I'm trying to find the QPS that
> >> this SOLR cloud setup can handle.
> >>
> >> To summarize,
> >>
> >> 1.   Find the QPS that my SOLR cloud setup can support
> >>
> >> 2.   Using version 5.3.1 with an external ZooKeeper
> >>
> >> 3.   3 Linux servers with 16 GB RAM and index a total of 2.2 million
> >> documents
> >>
> >> 4.   Yet to decide number of shards and replicas
> >>
> >> 5.   Not using any custom search application (performance testing for
> >> SOLR and not for Search portal)
> >>
> >> Thank you
> >>
>
