unexpected behaviour on seed nodes when using -Dcassandra.replace_token

2012-10-19 Thread Thomas van Neerijnen
Hi all, I recently tried to replace a dead node using -Dcassandra.replace_token=, which so far has been good to me. However, on one of my nodes this option was ignored and the node simply picked a different token to live at and started up there. It was a foolish mistake on my part because it was se

Re: replaced node keeps returning in gossip

2012-10-19 Thread Thomas van Neerijnen
> Aaron Morton > Freelance Developer > @aaronmorton > http://www.thelastpickle.com > > On 19/10/2012, at 2:44 AM, Thomas van Neerijnen > wrote: > > Hi all > > I'm running Cassandra 1.0.11 on Ubuntu 11.10. > > I've got a ghost node which keeps showing up o

replaced node keeps returning in gossip

2012-10-18 Thread Thomas van Neerijnen
Hi all, I'm running Cassandra 1.0.11 on Ubuntu 11.10. I've got a ghost node which keeps showing up on my ring. A node living on IP 10.16.128.210 and token 0 died and had to be replaced. I replaced it with a new node, on IP 10.16.128.197 and again token 0, with a "-Dcassandra.replace_token=0" at start

Re: java.lang.NoClassDefFoundError when trying to do anything on one CF on one node

2012-09-10 Thread Thomas van Neerijnen
nfo CF SSTables should fix it. > > I created https://issues.apache.org/jira/browse/CASSANDRA-4626 can you > please add more information there if you can and/or watch the ticket in case > there are other questions. > > Thanks > > > ----- > Aaron Morton > F

Re: java.lang.NoClassDefFoundError when trying to do anything on one CF on one node

2012-09-05 Thread Thomas van Neerijnen
forgot to answer your first question. I see this: INFO 14:31:31,896 No saved local node id, using newly generated: 92109b80-ea0a-11e1--51be601cd0af On Wed, Sep 5, 2012 at 8:41 AM, Thomas van Neerijnen wrote: > Thanks for the help Aaron. > I've checked NodeIdInfo and LocationIn

Re: java.lang.NoClassDefFoundError when trying to do anything on one CF on one node

2012-09-05 Thread Thomas van Neerijnen
d NodeInfo cfs. > * restart > > Note this will read the token from the yaml file again, so make sure it's > right. > > cheers > > - > Aaron Morton > Freelance Developer > @aaronmorton > http://www.thelastpickle.com > > On 4/09/2012, at

java.lang.NoClassDefFoundError when trying to do anything on one CF on one node

2012-09-04 Thread Thomas van Neerijnen
Hi, I have a single node in a 6-node Cassandra 1.0.11 cluster that seems to have a single column family in a weird state. Repairs, upgradesstables, anything that touches this CF crashes. I've drained the node, removed every file for this CF from said node, removed the commit log, started it up and

Why does a large compaction on one node affect the entire cluster?

2012-05-24 Thread Thomas van Neerijnen
Hi all, I am running Cassandra 1.0.10, installed from the Apache debs on Ubuntu 11.10, on a 7-node cluster. I moved some tokens around my cluster and now have one node compacting a large column family that uses Leveled compaction. It has done about 5k out of 10k outstanding compactions today. The other nodes

Cassandra CF merkle tree

2012-04-02 Thread Thomas van Neerijnen
Hi all, is there a way I can easily retrieve a Merkle tree for a CF, like the one created during a repair? I didn't see anything about this in the Thrift API docs, so I'm assuming this is a data structure made available only to internal Cassandra functions. I would like to explore using the Merkle tr

Re: ReplicateOnWriteStage exception causes a backlog in MutationStage that never clears

2012-03-23 Thread Thomas van Neerijnen
holding our replicas. One of the replicas would eventually fail under the pressure and the rest of the cluster would try holding hints for the bad keys' writes, which would cause the same problem on the rest of the cluster. On Thu, Mar 22, 2012 at 1:55 AM, Thomas van Neerijnen wrote: > Hi >

Re: Network, Compaction, Garbage collection and Cache monitoring in cassandra

2012-03-21 Thread Thomas van Neerijnen
Collectd with GenericJMX pushing data into Graphite is what we use. You can monitor the Graphite graphs directly instead of having an extra JMX interface on the Cassandra nodes for monitoring. On Wed, Mar 21, 2012 at 8:16 PM, Jeremiah Jordan < jeremiah.jor...@morningstar.com> wrote: > You can al
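
As a rough illustration of the Graphite end of such a setup (this is not the collectd GenericJMX plugin itself, only the wire format that Graphite's carbon daemon accepts on its plaintext port, 2003 by default), a metric is a single line of the form "path value timestamp". The metric path and the host name below are hypothetical placeholders.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Build one line in Graphite's plaintext protocol: 'path value timestamp\\n'."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def send_metric(line, host="graphite.example.com", port=2003):
    """Send one formatted line to a carbon plaintext listener.

    The host is a hypothetical placeholder; 2003 is carbon's default plaintext port.
    """
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))


if __name__ == "__main__":
    # Hypothetical metric name, for illustration only.
    print(format_metric("cassandra.node1.compaction.pending_tasks", 12), end="")
```

Only format_metric runs without a network; send_metric needs a reachable carbon listener.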

Re: ReplicateOnWriteStage exception causes a backlog in MutationStage that never clears

2012-03-21 Thread Thomas van Neerijnen
ing up and down a lot? Are they under GC pressure? The > other possibility is that you have overloaded the cluster. > > Cheers > > > - > Aaron Morton > Freelance Developer > @aaronmorton > http://www.thelastpickle.com > > On 22/03/2012, at 3:20 AM, T

ReplicateOnWriteStage exception causes a backlog in MutationStage that never clears

2012-03-21 Thread Thomas van Neerijnen
Hi all, I'm running into a weird error on Cassandra 1.0.7. As my cluster's load gets heavier, many of the nodes seem to hit the same error around the same time, resulting in MutationStage backing up and never clearing down. The only way to recover the cluster is to kill all the nodes and start them u

Re: Single Node Cassandra Installation

2012-03-16 Thread Thomas van Neerijnen
hanks for the comments, I guess I will end up doing a 2 node cluster with > replica count 2 and read consistency 1. > > -- Drew > > > > On Mar 15, 2012, at 4:20 PM, Thomas van Neerijnen wrote: > > So long as data loss and downtime are acceptable risks a one node cluster
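
For the replica count and consistency level mentioned above, the usual arithmetic can be sketched as follows. This is a minimal illustration, not anything from the thread: QUORUM is floor(RF / 2) + 1, and a read is only guaranteed to overlap the latest acknowledged write when the number of replicas read plus the number of replicas written exceeds the replication factor. The function names are made up for this sketch.

```python
def quorum(replication_factor):
    """Number of replicas that make up a QUORUM: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def read_overlaps_write(replication_factor, write_replicas, read_replicas):
    """True when every read set must share at least one replica with every write set."""
    return read_replicas + write_replicas > replication_factor


if __name__ == "__main__":
    rf = 2
    print("QUORUM at RF=2:", quorum(rf))
    print("write ONE, read ONE overlaps:", read_overlaps_write(rf, 1, 1))
    print("write ALL, read ONE overlaps:", read_overlaps_write(rf, 2, 1))
```

With a replica count of 2, QUORUM needs both replicas, so reading at consistency 1 is only guaranteed to see the latest write if writes wait for both replicas.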

Re: Single Node Cassandra Installation

2012-03-15 Thread Thomas van Neerijnen
So long as data loss and downtime are acceptable risks a one node cluster is fine. Personally this is usually only acceptable on my workstation, even my dev environment is redundant, because servers fail, usually when you least want them to, like for example when you've decided to save costs by wai

Re: 1.0.8 with Leveled compaction - Possible issues

2012-03-15 Thread Thomas van Neerijnen
Heya, I'd suggest staying away from Leveled Compaction until 1.0.9. For the why, see this great explanation I got from Maki Watanabe on the list: http://mail-archives.apache.org/mod_mbox/cassandra-user/201203.mbox/%3CCALqbeQbQ=d-hORVhA-LHOo_a5j46fQrsZMm+OQgfkgR=4rr...@mail.gmail.com%3E Keep an eye o

Re: cleanup crashing with "java.util.concurrent.ExecutionException: java.lang.ArrayIndexOutOfBoundsException: 8"

2012-03-14 Thread Thomas van Neerijnen
jira/browse/CASSANDRA-3989 > > You had better avoid using cleanup/scrub/upgradesstables if you > can on 1.0.7, though > it will not corrupt sstables. > > 2012/3/14 Thomas van Neerijnen: > > Hi all > > > > I am trying to run a cleanup on a column family and am g

Re: LeveledCompaction and/or SnappyCompressor causing memory pressure during repair

2012-03-14 Thread Thomas van Neerijnen
Thanks for the suggestions, but I'd already removed the compression when your message came through. That alleviated the problem but didn't solve it. I'm still looking at a few other possible causes and will post back if I work out what's going on; for now I am running rolling repairs to avoid another outa

cleanup crashing with "java.util.concurrent.ExecutionException: java.lang.ArrayIndexOutOfBoundsException: 8"

2012-03-14 Thread Thomas van Neerijnen
Hi all, I am trying to run a cleanup on a column family and am getting the following error returned after about 15 seconds. A cleanup on a slightly smaller column family completes in about 21 minutes. This is on the Apache packaged version of Cassandra on Ubuntu 11.10, version 1.0.7. ~# nodetool -

LeveledCompaction and/or SnappyCompressor causing memory pressure during repair

2012-03-08 Thread Thomas van Neerijnen
Hi all, running Cassandra 1.0.7, I recently changed a few read-heavy column families from SizeTieredCompactionStrategy to LeveledCompactionStrategy and added in SnappyCompressor, all with defaults, so 5MB files and, if memory serves me correctly, a 64k chunk size for compression. The results were amazin

Re: "Final buffer length 4690 to accomodate data size of 2347 for RowMutation" error caused node death

2012-03-07 Thread Thomas van Neerijnen
' and compaction_strategy = 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'; On Fri, Feb 24, 2012 at 10:07 PM, Jonathan Ellis wrote: > I've filed https://issues.apache.org/jira/browse/CASSANDRA-3957 as a > bug. Any further light you can shed here wou

"Final buffer length 4690 to accomodate data size of 2347 for RowMutation" error caused node death

2012-02-20 Thread Thomas van Neerijnen
Hi all, I am running the Apache packaged Cassandra 1.0.7 on Ubuntu 11.10. It has been running fine for over a month; however, I encountered the below error yesterday, which almost immediately resulted in heap usage rising quickly to almost 100% and client requests timing out on the affected node. I ga