Re: 0.7.4: Replication assertion error after removetoken, removetoken force and a restart

2011-08-21 Thread aaron morton
There is some confusion in the ring about nodes leaving. Check nodetool ring from every node and see if they agree. Check the logs to see if there is any information about which node is sending the wrong message. Without knowing much more you could try a rolling restart, but you may need a full res
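The "see if they agree" check above can be sketched in Python. This is a minimal illustration, not part of the original thread: it assumes the output of nodetool ring from each node has already been collected and reduced to a mapping of endpoint to state. The function name, host names, and state strings below are hypothetical.

```python
from collections import defaultdict


def find_ring_disagreements(views):
    """Return endpoints whose state differs between observers.

    `views` maps each observing node to its own picture of the ring,
    given as {endpoint: state}. This input shape is an assumption; the
    parsing of real `nodetool ring` output is deliberately left out.
    """
    seen = defaultdict(lambda: defaultdict(list))
    for observer, ring in views.items():
        for endpoint, state in ring.items():
            seen[endpoint][state].append(observer)

    # An endpoint that one observer does not list at all is also a disagreement.
    all_endpoints = set(seen)
    for observer, ring in views.items():
        for endpoint in all_endpoints - set(ring):
            seen[endpoint]["<missing>"].append(observer)

    return {
        endpoint: {state: sorted(obs) for state, obs in states.items()}
        for endpoint, states in seen.items()
        if len(states) > 1
    }


# Hypothetical three-node example: node-c is seen differently by each node.
example_views = {
    "node-a": {"node-a": "Normal", "node-b": "Normal", "node-c": "Leaving"},
    "node-b": {"node-a": "Normal", "node-b": "Normal", "node-c": "Normal"},
    "node-c": {"node-a": "Normal", "node-b": "Normal"},
}
disagreements = find_ring_disagreements(example_views)
```

An empty result means every node reports the same ring. Any entry names an endpoint worth looking up in the logs of the observers listed against it.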

Re: 0.7.4: Replication assertion error after removetoken, removetoken force and a restart

2011-08-20 Thread Anand Somani
0.7.4 / 3-node cluster / RF 3 / quorum read/write. After I re-introduced a corrupted node, I followed the process listed on the operations wiki to handle failures (thanks to folks on the mailing list for helping me). Still doing a cleanup on one node at this point. But I noticed that I am seeing thi

Re: 0.7.4: Replication assertion error after removetoken, removetoken force and a restart

2011-04-28 Thread aaron morton
I *think* that code is used when one node tells others via gossip it is removing a token that is not its own. The node that receives information in gossip does some work and then replies to the first node with a REPLICATION_FINISHED message, which is the node I assume the error is happening on.

0.7.4: Replication assertion error after removetoken, removetoken force and a restart

2011-04-27 Thread Alexis Lê-Quôc
Hi, I've been getting the following lately, every few seconds.

2011-04-27T20:21:18.299885+00:00 10.202.61.193 [MiscStage: 97] Error in ThreadPoolExecutor
2011-04-27T20:21:18.299885+00:00 10.202.61.193 java.lang.AssertionError
2011-04-27T20:21:18.300038+00:00 10.202.61.193 10.202.61.193 at org.a