RE: New keyspace loading creates 'master'
I can confirm that not all nodes receive the schema: not only new nodes, but also some old ones that were down at the time of keyspace creation and did not receive the update when they came back. Sometimes, even with all nodes running fine, not all of them receive the schema update.

Viktor

-----Original Message-----
From: Gary Dusbabek [mailto:gdusba...@gmail.com]
Sent: Friday, June 11, 2010 2:58 AM
To: dev@cassandra.apache.org
Subject: Re: New keyspace loading creates 'master'

On Thu, Jun 10, 2010 at 17:16, Ronald Park wrote:
> What we found was that, if the node on which we originally installed the
> keyspace was down when a new node is added, the new node does not get the
> keyspace schema. In some regards, it is now the 'master', at least in
> distributing the keyspace data. Is this a known limitation?
>

To clarify, are you saying that on a cluster of N nodes, if the original
node was down and there are N-1 live nodes, then new nodes will not receive
keyspace definitions? If so, it's less of a limitation and more of a bug.

Gary.

> Thanks,
> Ron
>
Re: New keyspace loading creates 'master'
I've filed this as https://issues.apache.org/jira/browse/CASSANDRA-1182.
I've created steps to reproduce based on your email and placed them in the
ticket description. Can you confirm that I've described things correctly?

Gary.

On Thu, Jun 10, 2010 at 17:16, Ronald Park wrote:
> Hi,
>
> We've been fiddling around with a small Cassandra cluster, bringing nodes up
> and down, to get a feel for how things are replicated and how spinning up a
> new node works (before having to learn it live :). We are using the trunk
> because we want to use the expiration feature. Along with that comes the
> new keyspace api to load the 'schema'.
>
> What we found was that, if the node on which we originally installed the
> keyspace was down when a new node is added, the new node does not get the
> keyspace schema. In some regards, it is now the 'master', at least in
> distributing the keyspace data. Is this a known limitation?
>
> Thanks,
> Ron
>
Re: Handoff on failure.
repartitioning is expensive. you don't want to do it as soon as a node goes
down (which may be temporary, cassandra has no way of knowing). so
repartitioning happens when decommission is done by a human.

On Thu, Jun 10, 2010 at 10:37 PM, Sriram Srinivasan wrote:
> I am looking at Cassandra 0.6.2's source code, and am unable to figure out
> where, if at all, repartitioning happens in the case of failure. The
> Gossiper's onDead message is ignored. Can someone please clarify this for
> me?

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
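As a point of reference, the operator-driven decommission Jonathan describes is normally kicked off with nodetool against the node's JMX port (8080 by default in 0.6). This is only a sketch: the host and token below are placeholders, and flag spellings can vary between versions.

    # Gracefully remove a live node; it streams its ranges to the
    # remaining replicas before leaving the ring.
    nodetool -host 10.0.0.12 -port 8080 decommission

    # Or rebalance a live node by moving it to a new token; it hands
    # off its old range and bootstraps into the new one.
    nodetool -host 10.0.0.12 -port 8080 move 85070591730234615865843651857942052864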
Re: New keyspace loading creates 'master'
I was unable to duplicate this problem using a 3 node cluster. Here were my steps:

1. bring up a seed node, give it a schema using loadSchemaFromYaml
2. bring up a second node. it received schema from the seed node.
3. bring the seed node down.
4. bring up a third node, but set its seed to be the second node (this is important!).

Is it possible in your testing that you only had one seed node (the original
node), which is the node you shut down? If a node cannot contact a seed node,
it never really joins the cluster and effectively becomes its own cluster of
one node. In that case it follows that it would never receive schema
definitions either.

If this isn't the case and you're still experiencing this, please let me know
the steps in which you bring nodes up and down so I can replicate.

Gary.

On Fri, Jun 11, 2010 at 06:42, Gary Dusbabek wrote:
> I've filed this as
> https://issues.apache.org/jira/browse/CASSANDRA-1182. I've created
> steps to reproduce based on your email and placed them in the ticket
> description. Can you confirm that I've described things correctly?
>
> Gary.
>
> On Thu, Jun 10, 2010 at 17:16, Ronald Park wrote:
>> Hi,
>>
>> We've been fiddling around with a small Cassandra cluster, bringing nodes up
>> and down, to get a feel for how things are replicated and how spinning up a
>> new node works (before having to learn it live :). We are using the trunk
>> because we want to use the expiration feature. Along with that comes the
>> new keyspace api to load the 'schema'.
>>
>> What we found was that, if the node on which we originally installed the
>> keyspace was down when a new node is added, the new node does not get the
>> keyspace schema. In some regards, it is now the 'master', at least in
>> distributing the keyspace data. Is this a known limitation?
>>
>> Thanks,
>> Ron
>>
>
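Step 4 above is the part that depends on configuration: the third node's seed list has to point at a node that is actually up. A rough sketch of what that looks like in a trunk-era cassandra.yaml follows; the exact key name may differ between trunk revisions, and the address is a placeholder.

    # cassandra.yaml on the third node: list the second node as a seed
    # so the new node can join the ring while the original seed is down.
    seeds:
        - 192.168.1.102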
Re: New keyspace loading creates 'master'
I've been trying a number of combinations of starting/stopping machines with
and without various seeds, and I can't recreate the problem now. I strongly
suspect it was because we were trying to use the DatacenterShardStrategy code.
Between changing snitches and shard classes, we haven't been able to recreate
the problem so far. It is likely we also had a misconfiguration of the
DatacenterShardStrategy properties files, or a mismatch of snitch and strategy
classes, at the time.

Sorry I didn't respond sooner so you wouldn't have had to spend time on this
testing. :(

However, perhaps Victor Jevdokimov might have a combination where this
occurred, as he stated in an earlier email.

Ron

Gary Dusbabek wrote:
> I was unable to duplicate this problem using a 3 node cluster. Here were my steps:
>
> 1. bring up a seed node, give it a schema using loadSchemaFromYaml
> 2. bring up a second node. it received schema from the seed node.
> 3. bring the seed node down.
> 4. bring up a third node, but set its seed to be the second node (this is important!).
>
> Is it possible in your testing that you only had one seed node (the original
> node), which is the node you shut down? If a node cannot contact a seed node,
> it never really joins the cluster and effectively becomes its own cluster of
> one node. In that case it follows that it would never receive schema
> definitions either.
>
> If this isn't the case and you're still experiencing this, please let me know
> the steps in which you bring nodes up and down so I can replicate.
>
> Gary.
>
> On Fri, Jun 11, 2010 at 06:42, Gary Dusbabek wrote:
>> I've filed this as
>> https://issues.apache.org/jira/browse/CASSANDRA-1182. I've created
>> steps to reproduce based on your email and placed them in the ticket
>> description. Can you confirm that I've described things correctly?
>>
>> Gary.
>>
>> On Thu, Jun 10, 2010 at 17:16, Ronald Park wrote:
>>> Hi,
>>>
>>> We've been fiddling around with a small Cassandra cluster, bringing nodes up
>>> and down, to get a feel for how things are replicated and how spinning up a
>>> new node works (before having to learn it live :). We are using the trunk
>>> because we want to use the expiration feature. Along with that comes the
>>> new keyspace api to load the 'schema'.
>>>
>>> What we found was that, if the node on which we originally installed the
>>> keyspace was down when a new node is added, the new node does not get the
>>> keyspace schema. In some regards, it is now the 'master', at least in
>>> distributing the keyspace data. Is this a known limitation?
>>>
>>> Thanks,
>>> Ron
Re: Handoff on failure.
Fair enough. But doesn't that mean that the node that comes up has the same
token? I suppose the answer is that the auto bootstrap process is smart
enough to figure out which range needs help.

Thanks much, Jonathan.
--sriram.

On Jun 11, 2010, at 8:59 PM, Jonathan Ellis wrote:
> repartitioning is expensive. you don't want to do it as soon as a node goes
> down (which may be temporary, cassandra has no way of knowing). so
> repartitioning happens when decommission is done by a human.
>
> On Thu, Jun 10, 2010 at 10:37 PM, Sriram Srinivasan wrote:
>> I am looking at Cassandra 0.6.2's source code, and am unable to figure out
>> where, if at all, repartitioning happens in the case of failure. The
>> Gossiper's onDead message is ignored. Can someone please clarify this for
>> me?
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
Re: Handoff on failure.
I'm not sure what you mean, but
http://wiki.apache.org/cassandra/Operations may clear some things up.

On Fri, Jun 11, 2010 at 7:49 PM, Sriram Srinivasan wrote:
> Fair enough. But doesn't that mean that the node that comes up has the same
> token? I suppose the answer is that the auto bootstrap process is smart
> enough to figure out which range needs help.
>
> Thanks much, Jonathan.
> --sriram.
>
> On Jun 11, 2010, at 8:59 PM, Jonathan Ellis wrote:
>
>> repartitioning is expensive. you don't want to do it as soon as a
>> node goes down (which may be temporary, cassandra has no way of
>> knowing). so repartitioning happens when decommission is done by a
>> human.
>>
>> On Thu, Jun 10, 2010 at 10:37 PM, Sriram Srinivasan wrote:
>>>
>>> I am looking at Cassandra 0.6.2's source code, and am unable to figure out
>>> where, if at all, repartitioning happens in the case of failure. The
>>> Gossiper's onDead message is ignored. Can someone please clarify this for
>>> me?
>>
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of Riptano, the source for professional Cassandra support
>> http://riptano.com
>

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
Re: Handoff on failure.
I saw the operations page, but didn't get what I was looking for. What I
meant by my earlier statement is that it is not clear to me who assigns the
token to a new node. If a node goes down and another process comes up, is
(a) the token assigned to it automatically by consensus (and the bootstrap
process gives it a place in the ring where load balancing is needed most),
or (b) is it an external script's responsibility to bootstrap every node
with a new InitialToken?

On Jun 12, 2010, at 8:46 AM, Jonathan Ellis wrote:
> I'm not sure what you mean, but
> http://wiki.apache.org/cassandra/Operations may clear some things up.
>
> On Fri, Jun 11, 2010 at 7:49 PM, Sriram Srinivasan wrote:
>> Fair enough. But doesn't that mean that the node that comes up has the same
>> token? I suppose the answer is that the auto bootstrap process is smart
>> enough to figure out which range needs help.
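If it helps, in 0.6.x the two options correspond to per-node settings in storage-conf.xml rather than any cluster-wide assignment: as I understand the documented behaviour, leaving InitialToken empty with AutoBootstrap enabled lets the joining node pick a token that takes load from the most heavily loaded node, while an external script can instead pin an explicit token. A rough sketch, with the token value as a placeholder:

    <!-- storage-conf.xml (0.6.x), set per node -->
    <!-- Closer to (a): empty InitialToken plus AutoBootstrap lets the
         joining node choose a token that relieves the busiest node. -->
    <AutoBootstrap>true</AutoBootstrap>
    <InitialToken></InitialToken>

    <!-- Option (b): an operator or script assigns the token explicitly. -->
    <!-- <InitialToken>85070591730234615865843651857942052864</InitialToken> -->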