Yeah, sorry, my maths was clearly flawed today. Thanks for correcting me,
Shawn.

What I meant was that in a 3 ZK setup, if you lose one machine, you are okay,
but you are also "at risk", since losing any other ZK would lose quorum.
So in our NRT-style scenario, we would have to get that dead machine back
ASAP.

As Shawn says, we have a larger ensemble to allow for another machine
crashing during a planned maintenance window (so we are down 2 ZKs for some
period of time, and that is still ok).
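
To make the arithmetic concrete, here is a rough Python sketch, assuming
ZooKeeper's default majority quorum (more than half of the ensemble must be
up). The function names are mine, purely for illustration; they are not part
of any ZooKeeper or Solr API.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of the ensemble."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many ZK servers can be down while a majority is still up."""
    return ensemble_size - quorum_size(ensemble_size)


def has_quorum(ensemble_size: int, servers_down: int) -> bool:
    """True if the servers still running form a majority."""
    return ensemble_size - servers_down >= quorum_size(ensemble_size)


for n in (3, 5):
    print(f"{n} ZKs: quorum={quorum_size(n)}, "
          f"can lose {tolerated_failures(n)}")
```

So with 3 ZKs you can lose one and keep quorum, but a second loss breaks it;
with 5 you can have one down for maintenance, lose another, and still be ok.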

It all depends on how much DR (disaster recovery) cover you need.

On 13 April 2016 at 16:48, Shawn Heisey <apa...@elyograg.org> wrote:

> On 4/13/2016 9:34 AM, Daniel Collins wrote:
> > Just to chip in, more ZKs are probably only necessary if you are doing
> > NRT indexing.
> >
> > Loss of a single ZK (in a 3 machine setup) will block indexing for the
> > time it takes to get that machine/instance back up
>
> That would NOT block indexing.  If you have three zookeepers and you
> lose one, SolrCloud functionality will not change.  If you lose TWO,
> then you would no longer be able to index.
>
> If you've seen a situation where losing one zookeeper out of three
> causes indexing to stop, then either something is not configured
> correctly, or you've encountered a bug.  I would bet more on a
> misconfiguration than a bug.
>
> A 5-node ensemble would allow you to lose a server and still be able to
> take down another server for maintenance, without affecting SolrCloud
> operation.
>
> Thanks,
> Shawn
