Re: gossip marking all nodes as down when decommissioning one node.

2013-11-22 Thread Ryan Fowler
You might be running into CASSANDRA-6244. That ended up being our problem anyway.

On Thu, Nov 21, 2013 at 9:37 PM, Robert Coli wrote:
> On Thu, Nov 21, 2013 at 6:17 PM, Alain RODRIGUEZ wrote:
>> Oh! Thanks.
>>
>> Is there any workaround to avoid the problem while waiting for the update?
>
> P…

Re: gossip marking all nodes as down when decommissioning one node.

2013-11-21 Thread Robert Coli
On Thu, Nov 21, 2013 at 6:17 PM, Alain RODRIGUEZ wrote:
> Oh! Thanks.
>
> Is there any workaround to avoid the problem while waiting for the update?

Per driftx in #cassandra, this is probably *not* 6297, because only a single flush is involved. If you haven't, I would consider filing a Cassandra…

Re: gossip marking all nodes as down when decommissioning one node.

2013-11-21 Thread Tupshin Harper
Increasing the phi value to 12 can be a partial workaround. It's certainly not a fix, but it does partially alleviate the issue. Otherwise, hang in there until 1.2.12. Aaron is probably right that this is aggravated on underpowered nodes, but larger nodes can still see these symptoms.

-Tupshin

On…
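For context, the setting being discussed is phi_convict_threshold in cassandra.yaml, which usually ships commented out at its default of 8. A minimal sketch of the workaround Tupshin describes, using his suggested value of 12:

    # cassandra.yaml (per node) -- failure detector sensitivity, default 8.
    # Raising it makes the failure detector slower to convict peers as DOWN,
    # which masks (but does not fix) the flapping described in this thread.
    phi_convict_threshold: 12

As far as I know the yaml value is only read at startup, so each node needs a restart (i.e. a rolling restart across the cluster) for the change to take effect.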

Re: gossip marking all nodes as down when decommissioning one node.

2013-11-21 Thread Alain RODRIGUEZ
Oh! Thanks.

Is there any workaround to avoid the problem while waiting for the update?

2013/11/22 Robert Coli:
> On Thu, Nov 21, 2013 at 2:39 AM, Alain RODRIGUEZ wrote:
>
>> I just experienced the same thing on our 28-node m1.xlarge C* 1.2.11 cluster.
>>
>> phi_convict_threshold is at its default of 8. I wi…

Re: gossip marking all nodes as down when decommissioning one node.

2013-11-21 Thread Robert Coli
On Thu, Nov 21, 2013 at 2:39 AM, Alain RODRIGUEZ wrote:
> I just experienced the same thing on our 28-node m1.xlarge C* 1.2.11 cluster.
>
> phi_convict_threshold is at its default of 8. I will try increasing it to 12, as 12
> seems to be the right value :)
>
> That's still weird to see all nodes marked down at…

Re: gossip marking all nodes as down when decommissioning one node.

2013-11-21 Thread Alain RODRIGUEZ
I just experienced the same thing on our 28-node m1.xlarge C* 1.2.11 cluster.

phi_convict_threshold is at its default of 8. I will try increasing it to 12, as 12 seems to be the right value :)

It's still weird to see all nodes marked down at once. I had never seen this before using vnodes...

Alain
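One way to confirm that peers are actually flapping to DOWN during the decommission, rather than inferring it from client errors, is to poll gossip state from a node that is not being decommissioned while the operation runs. A sketch with standard nodetool subcommands (run locally on any surviving node):

    # Run from a node that is *not* being decommissioned.
    nodetool status       # Up/Down and ownership as this node currently sees the ring
    nodetool gossipinfo   # raw gossip state (generation / heartbeat) per endpoint

If many endpoints show DOWN at the same moment, the failure detector really is convicting them, which is consistent with the phi_convict_threshold discussion in this thread.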

Re: gossip marking all nodes as down when decommissioning one node.

2013-10-28 Thread Aaron Morton
> (2 nodes in each availability zone)
How many AZs?

> The ec2 instances are m1.large
I strongly recommend using m1.xlarge with ephemeral disks or a higher-spec machine; m1.large is not up to the task.

> Why on earth is the decommissioning of one node causing all the nodes to be
> marked d…

gossip marking all nodes as down when decommissioning one node.

2013-10-25 Thread John Pyeatt
We are running a 6-node cluster in the Amazon cloud (2 nodes in each availability zone). The EC2 instances are m1.large and we have 256 vnodes on each node. We are using Ec2Snitch, NetworkTopologyStrategy, and a replication factor of 3.

When we decommission one node, suddenly reads and writes start to…
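For readers reconstructing the setup: with Ec2Snitch, the EC2 region becomes the datacenter name and the availability zone the rack, so a keyspace like the sketch below places one replica in each AZ. The keyspace name and region are made up for illustration; the decommission is the standard nodetool operation, run on the node being removed.

    -- hypothetical keyspace matching the description above (CQL3);
    -- 'us-east' stands in for whatever region name Ec2Snitch reports as the DC
    CREATE KEYSPACE app
      WITH replication = {'class': 'NetworkTopologyStrategy', 'us-east': 3};

    # then, on the node leaving the cluster, the step that triggers the symptom:
    nodetool decommission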