Re: vnodes: high availability

2018-01-15 Thread kurt greaves
Yeah, it's very unlikely that you will have 2 nodes in the cluster with NO intersecting token ranges (vnodes) for an RF of 3 (probably even 2). If node A goes down, all 256 ranges will go down, and considering there are only 49 other nodes, all with 256 vnodes each, it's very likely that every node w…
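[This claim can be checked with a small Monte Carlo sketch — my own illustration, not from the thread. It assumes random token assignment and SimpleStrategy-style placement (each range replicated to the next rf-1 distinct nodes clockwise on the ring), and counts how often every other node ends up as a replica partner of node 0.]

```python
import random

def all_nodes_share(n_nodes=50, vnodes=256, rf=3, trials=20):
    """Fraction of trials in which EVERY other node is a replica
    partner of node 0 (random tokens, SimpleStrategy-style placement)."""
    hits = 0
    for _ in range(trials):
        ring = sorted((random.random(), n)
                      for n in range(n_nodes) for _ in range(vnodes))
        owners = [n for _, n in ring]
        partners = set()
        for i, owner in enumerate(owners):
            if owner != 0:
                continue
            seen, j = {0}, i
            while len(seen) < rf:          # walk clockwise to find replicas
                j = (j + 1) % len(owners)
                seen.add(owners[j])
            partners |= seen - {0}
        if len(partners) == n_nodes - 1:
            hits += 1
    return hits / trials
```

With 50 nodes and 256 vnodes each, node 0 hands out roughly 512 replica slots among 49 peers, so the result comes out at (or extremely close to) 1.0 — matching the "very likely" above.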

Re: vnodes: high availability

2018-01-15 Thread Kyrylo Lebediev
Thanks Alexander! I don't have an MS in math either, unfortunately. Not sure, but it seems to me that the probability of 2/49 in your explanation doesn't take into account that vnode endpoints are almost evenly distributed across all nodes (at least that's what I can see from "nodetool ring" output). htt…

Re: Even after the drop table, the data actually was not erased.

2018-01-15 Thread Alain RODRIGUEZ
> As you said, the auto_bootstrap setting was turned on. Well, I was talking about 'auto_snapshot' ;-). I understand that's what you meant to say. > This command seems to apply only to one node. Can it be applied cluster-wide? Or should I run this command on each node? Indeed, 'nodetool c…
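[Since `nodetool cleanup` only affects the node it runs on, a "cluster-wide" cleanup is just a loop over all nodes. A minimal sketch, assuming passwordless ssh and hypothetical hostnames — adapt to your environment:]

```python
import subprocess

def cleanup_cluster(hosts, dry_run=True):
    """Build (and optionally run) one 'nodetool cleanup' per node via ssh.
    Run serially: cleanup rewrites SSTables and is I/O intensive, so
    running it on all nodes at once can hurt the whole cluster."""
    commands = [["ssh", host, "nodetool", "cleanup"] for host in hosts]
    if not dry_run:
        for cmd in commands:
            subprocess.run(cmd, check=True)
    return commands

# cleanup_cluster(["node1.example.com", "node2.example.com"])  # dry run
```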

Re: vnodes: high availability

2018-01-15 Thread Alexander Dejanovski
I was corrected off list that the odds of losing data when 2 nodes are down don't depend on the number of vnodes, but only on the number of nodes. The more vnodes, the smaller the chunks of data you may lose, and vice versa. I officially suck at statistics, as expected :) On Mon, Jan 15, 2018…
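[A back-of-the-envelope estimate (my own approximation, not from the thread) shows why the vnode count stops mattering in practice: the probability that two given nodes share at least one replica set saturates near 1 for typical vnode counts, so only the size of the affected chunk changes.]

```python
def p_share_range(n_nodes, vnodes, rf=3):
    """Rough estimate of P(two given nodes appear together in at least
    one replica set). For each of a node's tokens, the other node is
    among the next rf-1 distinct nodes with probability about
    (rf-1)/(n_nodes-1); either node's tokens can produce a collision."""
    p_miss = 1 - (rf - 1) / (n_nodes - 1)
    return 1 - p_miss ** (2 * vnodes)

# p_share_range(50, 256) ~ 1.0   -> 2 nodes down almost surely overlap
# p_share_range(50, 16)  ~ 0.74  -> still high even at low vnode counts
```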

Re: vnodes: high availability

2018-01-15 Thread Alexander Dejanovski
Hi Kyrylo, the situation is a bit more nuanced than shown by the Datastax diagram, which is fairly theoretical. If you're using SimpleStrategy, there is no rack awareness. Since vnode distribution is purely random, and the replica for a vnode will be placed on the node that owns the next vnode in…
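[The placement rule described above — replicas go to the nodes owning the next vnodes on the ring — can be sketched as follows. This is an illustrative toy model with made-up tokens, not Cassandra's actual implementation:]

```python
from bisect import bisect_right

def replicas_for(token, ring, rf=3):
    """SimpleStrategy-style placement sketch: find the vnode range the
    token falls in, then walk clockwise collecting the first rf DISTINCT
    nodes. `ring` is a sorted list of (token, node) pairs."""
    tokens = [t for t, _ in ring]
    i = bisect_right(tokens, token) % len(ring)
    result = []
    while len(result) < rf:
        node = ring[i][1]
        if node not in result:      # skip vnodes owned by an already-chosen node
            result.append(node)
        i = (i + 1) % len(ring)
    return result

# Toy 3-node ring with 2 vnodes each:
ring = [(0, 'A'), (10, 'B'), (20, 'A'), (30, 'C'), (40, 'B'), (50, 'C')]
```

For example, `replicas_for(5, ring)` returns `['B', 'A', 'C']`: token 5 falls in B's range, and the next distinct nodes clockwise are A and C.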

Re: vnodes: high availability

2018-01-15 Thread Kyrylo Lebediev
Thanks, Rahul. But in your example, simultaneous loss of Node3 and Node6 leads to loss of ranges N, C, and J at consistency level QUORUM. As far as I understand, in case vnodes > N_nodes_in_cluster and endpoint_snitch=SimpleSnitch, since: 1) "secondary" replicas are placed on the two nodes 'next'…

Re: vnodes: high availability

2018-01-15 Thread Rahul Neelakantan
Not necessarily. It depends on how the token ranges for the vnodes are assigned to them. For example, take a look at this diagram: http://docs.datastax.com/en/archived/cassandra/2.0/cassandra/architecture/architectureDataDistributeDistribute_c.html In the vnode part of the diagram, you will see that…

vnodes: high availability

2018-01-15 Thread Kyrylo Lebediev
Hi, Let's say we have a C* cluster with the following parameters: - 50 nodes in the cluster - RF=3 - vnodes=256 per node - CL for some queries = QUORUM - endpoint_snitch = SimpleSnitch. Is it correct that any 2 nodes down will cause unavailability of a key range at CL=QUORUM? Regards, K
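[The question can be answered empirically with a Monte Carlo sketch — my own illustration under the stated parameters, assuming random token assignment and SimpleStrategy-style placement. It takes 2 random nodes down and checks whether some range is left with only 1 of its 3 replicas, which breaks QUORUM:]

```python
import random

def p_quorum_unavailable(n_nodes=50, vnodes=256, rf=3, trials=30):
    """Estimate P(some range loses 2 of its 3 replicas) when 2 random
    nodes are down simultaneously."""
    broken = 0
    for _ in range(trials):
        ring = sorted((random.random(), n)
                      for n in range(n_nodes) for _ in range(vnodes))
        owners = [n for _, n in ring]
        down = set(random.sample(range(n_nodes), 2))
        for i in range(len(owners)):
            seen, j = {owners[i]}, i
            while len(seen) < rf:          # replica set for this range
                j = (j + 1) % len(owners)
                seen.add(owners[j])
            if len(seen & down) == 2:      # only 1 of 3 replicas left
                broken += 1
                break
    return broken / trials
```

With these defaults the result is essentially always 1.0, i.e. with 256 vnodes and 50 nodes, any 2 simultaneous node failures almost surely make at least one range unavailable at QUORUM.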

Re: Cleanup blocking snapshots - Options?

2018-01-15 Thread Nicolas Guyomar
Hi, It might really be a long shot, but I thought a UserDefinedCompaction triggered by JMX on a single SSTable might remove data the node does not own (to answer the "Any other way to re-write SSTables with data a node owns after a cluster scale out?" part of your question). I might be wrong though. On…
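[For reference, a user-defined compaction is triggered through the `forceUserDefinedCompaction` operation on Cassandra's CompactionManager MBean. A minimal sketch that builds a jmxterm one-liner for it — the jar path is hypothetical, and the exact operation signature and jmxterm flags should be verified against your versions before running:]

```python
def jmxterm_udc(sstable, host="localhost", port=7199):
    """Build a shell pipeline that triggers a user-defined compaction
    on one SSTable via jmxterm's non-interactive mode (sketch only:
    verify MBean/flags against your Cassandra and jmxterm versions)."""
    bean = "org.apache.cassandra.db:type=CompactionManager"
    op = f"run -b {bean} forceUserDefinedCompaction {sstable}"
    return f"echo '{op}' | java -jar jmxterm.jar -l {host}:{port} -n"

# jmxterm_udc("mc-1-big-Data.db")  # prints the command, does not run it
```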