Re: Question about node failure...

2010-04-05 Thread Jonathan Ellis
On Mon, Apr 5, 2010 at 5:20 PM, Rob Coli wrote:
> On 4/5/10 2:11 PM, Jonathan Ellis wrote:
>> On Mon, Mar 29, 2010 at 6:42 PM, Tatu Saloranta wrote:
>>> Perhaps it would be good to have convenience workflow for replacing
>>> broken host ("squashing lemons")? I would assume that most com

Re: Question about node failure...

2010-04-05 Thread Rob Coli
On 4/5/10 2:11 PM, Jonathan Ellis wrote:
> On Mon, Mar 29, 2010 at 6:42 PM, Tatu Saloranta wrote:
>> Perhaps it would be good to have convenience workflow for replacing
>> broken host ("squashing lemons")? I would assume that most common use

[ snip ]

Does anyone have numbers on how badly "nodetool re

Re: Question about node failure...

2010-04-05 Thread Jonathan Ellis
On Mon, Mar 29, 2010 at 6:42 PM, Tatu Saloranta wrote:
> Perhaps it would be good to have convenience workflow for replacing
> broken host ("squashing lemons")? I would assume that most common use
> case is to effectively replace host that can't be repaired (or perhaps
> it might sometimes be best

Re: Question about node failure...

2010-03-29 Thread Tatu Saloranta
On Mon, Mar 29, 2010 at 10:40 AM, Ned Wolpert wrote:
> So, what does "anti-entropy repair" do then?

Fix discrepancies between live nodes? (caused by transient failures presumably)

> Sounds like you have to 'decommission' the dead node, then I thought run
> 'nodeprobe repair' to get the data adj

Re: Question about node failure...

2010-03-29 Thread Ned Wolpert
So, what does "anti-entropy repair" do then? Sounds like you have to 'decommission' the dead node, then I thought run 'nodeprobe repair' to get the data adjusted back to a replication factor of 3, right? Also, what is the method to decommission a dead node? pass in the IP address of the dead nod

Re: Question about node failure...

2010-03-29 Thread Jonathan Ellis
On Mon, Mar 29, 2010 at 12:27 PM, Ned Wolpert wrote:
> Folks-
>
> Can someone point out what happens during a node failure? Here is the
> specific use case:
>
>   - Cassandra cluster with 4 nodes, replication factor of 3
>   - One node fails.
>   - At this point, data that existed on the one failed

Question about node failure...

2010-03-29 Thread Ned Wolpert
Folks-

Can someone point out what happens during a node failure? Here is the specific use case:

  - Cassandra cluster with 4 nodes, replication factor of 3
  - One node fails.
  - At this point, data that existed on the one failed node has copies on 2 live nodes.
  - The failed node never comes ba