Re: Cluster not recovering when a single node dies

2011-05-27 Thread Paul Loy
Sounds reasonable. Thanks. On Fri, May 27, 2011 at 7:12 PM, Jonathan Ellis wrote: > It does not. (Most failures are transient, so Cassandra doesn't > inflict the non-negligible performance impact of re-replicating a full > node's worth of data until you tell it "that guy's not coming back > thi

Re: Cluster not recovering when a single node dies

2011-05-27 Thread Jonathan Ellis
It does not. (Most failures are transient, so Cassandra doesn't inflict the non-negligible performance impact of re-replicating a full node's worth of data until you tell it "that guy's not coming back this time.") On Fri, May 27, 2011 at 10:47 AM, Paul Loy wrote: > I guess my next question is: t

Re: Cluster not recovering when a single node dies

2011-05-27 Thread Paul Loy
I guess my next question is: the data should be complete somewhere in the ring with RF = 2. Does Cassandra not redistribute the replication ring without a nodetool decommission call? On Fri, May 27, 2011 at 4:45 PM, Paul Loy wrote: > ahh, thanks. > > On Fri, May 27, 2011 at 4:43 PM, Jonathan Ell

Re: Cluster not recovering when a single node dies

2011-05-27 Thread Paul Loy
ahh, thanks. On Fri, May 27, 2011 at 4:43 PM, Jonathan Ellis wrote: > Quorum of 2 is 2. You need at least RF=3 for quorum to tolerate losing > a node indefinitely. > > On Fri, May 27, 2011 at 10:37 AM, Paul Loy wrote: > > We have a 4 node cluster with a replication factor of 2. When one node >

Re: Cluster not recovering when a single node dies

2011-05-27 Thread Jonathan Ellis
Quorum of 2 is 2. You need at least RF=3 for quorum to tolerate losing a node indefinitely. On Fri, May 27, 2011 at 10:37 AM, Paul Loy wrote: > We have a 4 node cluster with a replication factor of 2. When one node dies, > the other nodes throw UnavailableExceptions for quorum reads (as expected
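
The arithmetic behind "Quorum of 2 is 2" can be sketched as follows. This is a minimal illustration, assuming the usual majority rule of floor(RF / 2) + 1 replicas for a quorum; the function names are hypothetical and are not part of Cassandra or nodetool.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond for a quorum, assuming a simple majority rule."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor // 2 + 1


def tolerated_replica_failures(replication_factor: int) -> int:
    """Replicas of a given key that can be down while a quorum is still reachable."""
    return replication_factor - quorum(replication_factor)


if __name__ == "__main__":
    # Hypothetical sample of replication factors, for illustration only.
    for rf in (1, 2, 3, 5):
        print(f"RF={rf}: quorum={quorum(rf)}, "
              f"tolerated failures={tolerated_replica_failures(rf)}")
```

With RF=2 both replicas are needed, so a single dead replica makes quorum operations on its key ranges unavailable. With RF=3 the quorum is still 2, which leaves room for one replica to be down.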

Cluster not recovering when a single node dies

2011-05-27 Thread Paul Loy
We have a 4-node cluster with a replication factor of 2. When one node dies, the other nodes throw UnavailableExceptions for quorum reads (as expected initially). They never get out of that state. Is there something we can do in nodetool to make the remaining nodes function? Thanks.