Re: Replicating to all nodes

2011-07-15 Thread Peter Schuller
> I am worried that if only 1 node is active and online, and the other
> N-1 nodes are inactive, down, and offline, that the cluster will not
> be able to complete the operation, because not all of the data is
> available on the 1 node that is up.

Which is true, but the correct way normally is to

Re: Replicating to all nodes

2011-07-15 Thread Kyle Gibson
> The node (known as the "coordinating node" because it co-ordinates the
> request submitted by the client) will send the request to the nodes
> that are in the replica set for the row. The client need not care
> about which host it connects to, other than that it be "one of the
> ones in the corre

Re: Replicating to all nodes

2011-07-15 Thread Peter Schuller
> I understand that CL.ONE means the read operation will block until at
> least one -replica- responds. If this node is not a replica, what
> happens?

The node (known as the "coordinating node" because it co-ordinates the request submitted by the client) will send the request to the nodes that are
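The routing described in this reply can be sketched in a few lines of Python. This is a simplified model only, not Cassandra's actual code: the function names `replica_set` and `coordinate`, the tiny integer tokens, and the "walk clockwise to the next RF nodes" placement rule are all assumptions made for illustration.

```python
import bisect

def replica_set(ring, key_token, rf):
    """Return the nodes holding a row, given a ring sorted by token.

    ring is a list of (token, node) pairs. In this simplified model the
    first replica is the node whose token is the first one >= key_token
    (wrapping around), and the remaining replicas are the next nodes
    clockwise on the ring.
    """
    tokens = [token for token, _ in ring]
    start = bisect.bisect_left(tokens, key_token) % len(ring)
    count = min(rf, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]

def coordinate(coordinator, ring, key_token, rf):
    """Describe where a coordinator would forward a request (hypothetical helper)."""
    replicas = replica_set(ring, key_token, rf)
    return {
        "coordinator": coordinator,
        "forward_to": replicas,
        "coordinator_is_replica": coordinator in replicas,
    }

# Four nodes with made-up tokens; real tokens are far larger.
ring = [(0, "A"), (25, "B"), (50, "C"), (75, "D")]
print(coordinate("A", ring, key_token=30, rf=2))
```

Here node A is not a replica for the row, yet it can still serve the client by forwarding the request to C and D, which is the point the reply makes about the coordinating node.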

Re: Replicating to all nodes

2011-07-15 Thread Kyle Gibson
> No. I am not entirely sure from where the confusion comes, so I will
> just try to summarize things from scratch in a brief manner.
>
> Any piece of data you store in Cassandra is going to be in a
> particular row, which has a row key.
>
> That row will have a "replica set" in the Cassandra clust

Re: Replicating to all nodes

2011-07-15 Thread Peter Schuller
> I was/am under the impression that a node owns a particular token
> range, and does not save any data that falls outside of that range
> (with the exception of any data that might be replicated to it). Based on
> what you are saying, each node owns a token range, but also maintains
> copies of data o

Re: Replicating to all nodes

2011-07-15 Thread Kyle Gibson
So my understanding of how Cassandra saves data is incorrect. I was/am under the impression that a node owns a particular token range, and does not save any data that falls outside of that range (with the exception of any data that might be replicated to it). Based on what you are saying, each node ow

Re: Replicating to all nodes

2011-07-15 Thread Peter Schuller
> The goal is to configure a cluster in which reads and writes can
> complete successfully even if only 1 node is online. For this to work,

Why? You should be designing for "only 1 out of N nodes" where N is RF. If you happen to have 3 machines now and you want 3 copies in total, that's fine. But w

Re: Replicating to all nodes

2011-07-13 Thread Maki Watanabe
Consistency and availability trade off against each other. If you use RF=7 with CL=ONE, your reads and writes will succeed as long as one node is alive, while the data is replicated to 7 nodes. Of course, you may read stale data in this case. If you need strong consistency, you must use CL=QUORUM. maki
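The trade-off in this message can be checked with a little arithmetic. The sketch below is illustrative and does not use any Cassandra API: the helper names are hypothetical, and it assumes the usual rules that a quorum is a majority of the RF replicas and that reads see the latest write when the read and write replica counts add up to more than RF.

```python
def quorum(rf: int) -> int:
    """Majority of the replicas for replication factor rf."""
    return rf // 2 + 1

def required(cl: str, rf: int) -> int:
    """Replicas that must respond for a consistency level (hypothetical helper)."""
    return {"ONE": 1, "QUORUM": quorum(rf), "ALL": rf}[cl]

def available(cl: str, rf: int, live_replicas: int) -> bool:
    """True if enough replicas are alive to satisfy the consistency level."""
    return live_replicas >= required(cl, rf)

def strongly_consistent(read_cl: str, write_cl: str, rf: int) -> bool:
    """True if every read set overlaps every write set in at least one replica."""
    return required(read_cl, rf) + required(write_cl, rf) > rf

rf = 7
for cl in ("ONE", "QUORUM"):
    print(cl,
          "survives with 1 live replica:", available(cl, rf, live_replicas=1),
          "| strongly consistent:", strongly_consistent(cl, cl, rf))
```

With RF=7, CL=ONE keeps working with a single live replica but gives no overlap guarantee, while CL=QUORUM needs 4 live replicas and does guarantee overlap between reads and writes.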

Re: Replicating to all nodes

2011-07-13 Thread Kyle Gibson
Thanks for the reply, Peter. The goal is to configure a cluster in which reads and writes can complete successfully even if only 1 node is online. For this to work, each node would need the entire dataset. Your example of a 3-node ring with RF=3 would satisfy this requirement. However, if two nodes

Re: Replicating to all nodes

2011-07-13 Thread Peter Schuller
> Read and write operations should succeed even if only 1 node is online.
>
> When a read is performed, it is performed against all active nodes.

Using QUORUM is the closest thing you get for reads without modifying Cassandra. You can't make it wait for all nodes that happen to be up.

> When a wr

Replicating to all nodes

2011-07-13 Thread Kyle Gibson
I am wondering if the following cluster configuration is possible with Cassandra, and if so, how it could be achieved. Please also feel free to point out any issues that may make this configuration undesirable that I may not have thought of. Suppose a cluster of N nodes. Each node replicates the data