Just to chip in, more ZKs are probably only necessary if you are doing NRT
indexing.

Loss of a single ZK (in a 3-machine setup) will block indexing for the time
it takes to get that machine/instance back up; it will have less impact on
search, however, since the search side can keep working from the existing
state of the cloud.  If you only index once a day, then that's fine, but in
our scenario we index continually all day long, so we can't afford a "break".
Hence we currently run 7 ZKs, though we plan to go down to 5.  That gives us
the ability to lose 2 machines without affecting indexing.

But as Erick says, for "normal" scenarios, where search load is much
greater than indexing load, 3 should be sufficient.
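
For reference, ZooKeeper stays writable as long as a strict majority of the
ensemble is up, i.e. floor(N/2) + 1 servers.  A quick sketch of that
arithmetic (plain Python, for illustration only):

    # Majority-quorum arithmetic for a ZooKeeper ensemble (illustration only)
    def zk_tolerance(ensemble_size):
        quorum = ensemble_size // 2 + 1     # servers needed to keep a quorum
        tolerated = ensemble_size - quorum  # servers you can afford to lose
        return quorum, tolerated

    for n in (3, 5, 7):
        quorum, tolerated = zk_tolerance(n)
        print(f"{n} ZKs: quorum of {quorum}, can lose {tolerated}")

which is where the "5 ZKs, lose 2 machines" figure above comes from.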


On 13 April 2016 at 15:27, Erick Erickson <erickerick...@gmail.com> wrote:

> bq: or is it dependent on query load and performance sla's
>
> Exactly. The critical bit is that every single replica meets your SLA.
> By that I mean: let's say your SLA is 500ms. If you can
> serve 10 QPS at that SLA with one replica per shard (i.e. leader only),
> you can serve 50 QPS by adding 4 more replicas.
>
> What you _cannot_ do is reduce the 500ms response time by
> adding more replicas. You'll need to add more shards, which probably
> means re-indexing. Which is why I recommend pushing a test system
> to destruction before deciding on the final numbers.
>
> And having at least 2 replicas per shard (leader and replica) is usually
> a very good thing because Solr will stop serving queries or indexing
> if all the replicas for any shard are down.
>
> Best,
> Erick
>
> On Wed, Apr 13, 2016 at 7:19 AM, Jay Potharaju <jspothar...@gmail.com> wrote:
> > Thanks for the feedback, Erick.
> > I am assuming the number of replicas helps with load balancing and
> > reliability. That being said, are there any recommendations for that, or
> > is it dependent on query load and performance SLAs?
> >
> > Any suggestions on AWS setup?
> > Thanks
> >
> >
> >> On Apr 13, 2016, at 7:12 AM, Erick Erickson <erickerick...@gmail.com> wrote:
> >>
> >> For collections with this few nodes, 3 zookeepers are plenty. From
> >> what I've seen, people don't go to 5 zookeepers until they have
> >> hundreds and hundreds of nodes.
> >>
> >> 100M docs can fit on 2 shards; I've actually seen many more. That
> >> said, if the docs are very large and/or the searches are complex,
> >> performance may not be what you need. Here's a long blog on
> >> testing a configuration to destruction to be _sure_ you can scale
> >> as you need:
> >>
> >>
> https://lucidworks.com/blog/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/
> >>
> >> Best,
> >> Erick
> >>
> >>> On Wed, Apr 13, 2016 at 6:47 AM, Jay Potharaju <jspothar...@gmail.com> wrote:
> >>> Hi,
> >>>
> >>> In my current setup I have about 30 million docs, which will grow to 100
> >>> million by the end of the year. In order to accommodate scaling and query
> >>> load, I am planning to have at least 2 shards and 2-3 replicas to begin
> >>> with. With the above SolrCloud setup I plan to have 3 zookeepers in the
> >>> quorum.
> >>>
> >>> If the number of replicas and shards increases, the number of Solr
> >>> instances will also go up. Keeping that in mind, I was wondering if
> >>> there are any guidelines on the ratio of ZK instances to Solr
> >>> instances.
> >>>
> >>> Secondly, are there any recommendations for setting up Solr in AWS?
> >>>
> >>> --
> >>> Thanks
> >>> Jay
>
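
P.S. For what it's worth, the 2-shard / 2-3-replica layout discussed above
can be stood up with a single Collections API CREATE call.  A minimal sketch
(Python + requests; the host, collection name and configset name are
placeholders, and the configset is assumed to already be uploaded to ZK):

    # Sketch only: create a 2-shard, replicationFactor=3 collection via the
    # Solr Collections API. Host and names below are placeholders.
    import requests

    SOLR = "http://localhost:8983/solr"  # any node in the SolrCloud cluster

    resp = requests.get(SOLR + "/admin/collections", params={
        "action": "CREATE",
        "name": "docs",                        # hypothetical collection name
        "collection.configName": "docs_conf",  # configset already in ZK
        "numShards": 2,                        # 2 shards, as planned above
        "replicationFactor": 3,                # leader + 2 replicas per shard
        "maxShardsPerNode": 2,
        "wt": "json",
    })
    resp.raise_for_status()
    print(resp.json())

Adding query capacity later is then an ADDREPLICA call rather than a
re-shard, which is exactly the distinction Erick draws above.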
