Re: Odd ring problems with 0.5.1

2010-04-23 Thread Anthony Molinaro
Turns out I needed to shut everything down completely, then start it all up; a rolling restart was still resulting in some nodes being confused about what ring they were in. I think the moral of all this is that any change to the seed node must result in a full restart of your cluster. Also any use o

Re: Odd ring problems with 0.5.1

2010-04-23 Thread Anthony Molinaro
On Fri, Apr 23, 2010 at 01:17:21PM -0500, Jonathan Ellis wrote:
> On Fri, Apr 23, 2010 at 1:12 PM, Anthony Molinaro wrote:
> > I'm not sure how it would get this, maybe I need to restart my seed node?
> It's worth a try. Sounds like you found an unusual bug in gossip.
Damn, restarting the

Re: Odd ring problems with 0.5.1

2010-04-23 Thread Jonathan Ellis
On Fri, Apr 23, 2010 at 1:12 PM, Anthony Molinaro wrote:
> I'm not sure how it would get this, maybe I need to restart my seed node?
It's worth a try. Sounds like you found an unusual bug in gossip.
> When I run nodeprobe ring on the seed I don't see any of the hosts I
> decommissioned, but may

Re: Odd ring problems with 0.5.1

2010-04-23 Thread Anthony Molinaro
On Fri, Apr 23, 2010 at 12:41:17PM -0500, Jonathan Ellis wrote:
> On Fri, Apr 23, 2010 at 12:30 PM, Anthony Molinaro wrote:
> > Some nodes appear in the ring from some nodes, but not others.  Right
> > now I have 14 nodes, 10 of those nodes have the same output of a
> > nodeprobe ring, the othe

Re: Odd ring problems with 0.5.1

2010-04-23 Thread Jonathan Ellis
On Fri, Apr 23, 2010 at 12:30 PM, Anthony Molinaro wrote:
> Some nodes appear in the ring from some nodes, but not others.  Right
> now I have 14 nodes, 10 of those nodes have the same output of a
> nodeprobe ring, the other 4 are missing one node.
What's the history of the missing node? Is it a
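
For anyone chasing the same symptom, here is a rough sketch of how the per-node ring views could be compared once they have been collected. It does not call nodeprobe itself; it assumes you have already turned each node's `nodeprobe ring` output into a set of endpoint addresses by whatever means you like. The function name `find_ring_disagreements` and the 10.0.0.x addresses are made up for illustration and do not come from the thread.

```python
def find_ring_disagreements(views):
    """Report which endpoints each node fails to see.

    views: dict mapping a reporting node's address to the set of endpoint
    addresses that node lists in its own ring output (hypothetical input
    format, parsed by the caller).
    Returns a dict mapping reporting node -> sorted list of endpoints that
    some other node sees but this node does not.
    """
    # Treat the union of all views as the candidate ring membership.
    all_endpoints = set().union(*views.values())
    missing = {}
    for node, seen in views.items():
        gap = all_endpoints - seen
        if gap:
            missing[node] = sorted(gap)
    return missing


if __name__ == "__main__":
    # Hypothetical 14-node cluster: 10 nodes see everyone, 4 are missing one node.
    nodes = [f"10.0.0.{i}" for i in range(1, 15)]
    full = set(nodes)
    views = {n: set(full) for n in nodes[:10]}
    for n in nodes[10:]:
        views[n] = full - {"10.0.0.5"}
    for node, gap in find_ring_disagreements(views).items():
        print(node, "does not see", ", ".join(gap))
```

The union is only a candidate membership list: if a decommissioned host still shows up in one node's view, it will be reported as missing from all the others, which is itself a useful hint about which node holds stale state.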

Odd ring problems with 0.5.1

2010-04-23 Thread Anthony Molinaro
So I've been trying to migrate off of old ec2 m1.large nodes onto xlarge nodes so I can get enough breathing room to then do an upgrade to 0.6.x (I can't keep the large nodes up long enough, so I spend all my time restarting and trying to move data, so can get all the packages I would need for 0.6.