Re: Compaction JMX Stats?

2010-05-21 Thread Anthony Molinaro
On Thu, May 20, 2010 at 02:11:23PM -0700, Jonathan Ellis wrote: > No, CM is not exposed to nodetool yet. (You should really be putting > metrics into a real monitoring system rather than relying on nodetool. > Some example munin plugins are at > http://github.com/jbellis/cassandra-munin-plugins,

Re: oom in ROW-MUTATION-STAGE

2010-05-21 Thread Jonathan Ellis
Can you monitor cassandra-level metrics like the ones in http://github.com/jbellis/cassandra-munin-plugins ? The usual culprit is compaction, but your compacted row size is small. Nothing else really comes to mind. (You should check the system keyspace too, though; HH rows can get large.) On Fri,

Re: An ORM like plugin for Cassandra under Datanucleus

2010-05-21 Thread Jonathan Ellis
Thanks! You should probably add it to http://wiki.apache.org/cassandra/ClientOptions. On Fri, May 21, 2010 at 12:22 PM, Pedro Gomes wrote: > Hi all > Over the past few weeks I have developed a plugin for the Java persistence > platform Datanucleus, similar to the one presented by Google for App En

Re: Scaling problems

2010-05-21 Thread Jonathan Ellis
On Fri, May 21, 2010 at 9:09 AM, Ian Soboroff wrote: > HINTED-HANDOFF-POOL 1 158 23 This is your smoking gun. HH tasks suck a ton of CPU and you have 158 backed up. I would just blow the HH files away from your data/system directory, restart the node, and run rep
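The diagnosis above reads a backlog straight off one thread-pool line. A rough sketch of doing the same check in a script follows. It assumes the three numbers are active, pending, and completed task counts, in that order (the message only confirms that 158 is the backed-up count), and the `threshold` value is a hypothetical cutoff, not a recommended setting.

```python
from typing import NamedTuple


class StageStats(NamedTuple):
    name: str
    active: int
    pending: int
    completed: int


def parse_stage_line(line: str) -> StageStats:
    """Parse 'NAME active pending completed' (column order is an assumption)."""
    name, active, pending, completed = line.split()
    return StageStats(name, int(active), int(pending), int(completed))


def backed_up(stats: StageStats, threshold: int = 100) -> bool:
    """Flag a stage whose pending count exceeds a hypothetical threshold."""
    return stats.pending > threshold


stats = parse_stage_line("HINTED-HANDOFF-POOL   1   158 23")
print(stats.name, stats.pending, backed_up(stats))
```

Running a check like this per stage on each node would point at the same pool without reading the output by eye.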

Re: Cassandra thrift question

2010-05-21 Thread Jonathan Ellis
Because when we tested it, it was slower. 2010/5/21 Даниел Симеонов : > Hi, > I have a question about the thrift protocol used to connect to Cassandra, > I saw in class CassandraDaemon that TServerSocket is being used; > why is TNonblockingServerSocket not being used? Thank you very much! > Bes

Re: Problems running Cassandra 0.6.1 on large EC2 instances.

2010-05-21 Thread Curt Bererton
We can get Cassandra to run great for a few hours now. Writing to and reading from cassandra work well and the read/write times are good etc. We also changed our config to enable row caching (we're hoping to ditch our memcache server layer entirely). Unfortunately, running on an EC2 High Memory

Re: Problems running Cassandra 0.6.1 on large EC2 instances.

2010-05-21 Thread S Ahmed
Curious: how did things turn out? On Tue, May 18, 2010 at 1:38 PM, Curt Bererton wrote: > We only have a few CFs (6 or 7). I've increased the MemtableThroughputInMB > and MemtableOperationsInMillions as per your suggestions. Do we really > need a swap file though? I suppose it can't hurt, but wi

An ORM like plugin for Cassandra under Datanucleus

2010-05-21 Thread Pedro Gomes
Hi all, Over the past few weeks I have developed a plugin for the Java persistence platform Datanucleus, similar to the one presented by Google for App Engine and the HBase one already present in the platform. Datanucleus: http://www.datanucleus.org/project/download.html For now it allows the persist

Re: Scaling problems

2010-05-21 Thread Ian Soboroff
Ok, spending some time slogging through the cassandra-user archives. Seems lots of folks have this problem. Starting with a JVM upgrade, then skimming through JIRA looking for patches. Ian On Fri, May 21, 2010 at 12:09 PM, Ian Soboroff wrote: > So at the moment, I'm not running my loader, and

Any kind soul gotten cassandra working with jcollectd?

2010-05-21 Thread Curt Bererton
Hello all, We're running cassandra on EC2 (via Rightscale), and I'm working on connecting up the cassandra JMX monitoring to try to debug why our 0.6.1 Cassandra machines spike to 100% CPU and then choke after about 6 hours' worth of production data running through them. We're running on 64-bit EC2

Re: Multiple hard disks configuration

2010-05-21 Thread Rob Coli
On Mon, May 17, 2010 at 10:39 PM, Ma Xiao wrote: Hi all, We recently set up 5 nodes running cassandra, each with 4 x 1.5TB drives, ...then I put 4 paths in DataFileDirectory. My question is: what's going to happen when one of the disks fails, especially the one that has the OS installed, which also holds

Cassandra thrift question

2010-05-21 Thread Даниел Симеонов
Hi, I have a question about the thrift protocol used to connect to Cassandra. I saw in class CassandraDaemon that TServerSocket is being used; why is TNonblockingServerSocket not being used? Thank you very much! Best regards, Daniel.

Re: Scaling problems

2010-05-21 Thread Ian Soboroff
So at the moment, I'm not running my loader, and I'm looking at one node which is slow to respond to nodetool requests. At this point, it has a pile of hinted-handoffs pending which don't seem to be draining out. The system.log shows that it's GCing pretty much constantly. Ian $ /usr/local/src/

Re: Possible cause for a "could not connect" TTransportException.NOT_OPEN during data load?

2010-05-21 Thread xavier manach
Thanks. You resolved my problem. My error: I didn't realize that transport.open opens a new socket for each call. I thought it reused the same one. 2010/5/20 Jonathan Ellis > "disseminating load info" is not related to your problem. > > certainly you should be using connection pooling rather th
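The fix discussed here is to open a small number of connections once and reuse them, rather than opening a socket per call. A minimal sketch of that pattern follows. `FakeTransport` and `ConnectionPool` are hypothetical names invented for illustration; `FakeTransport` only counts `open()` calls and stands in for whatever transport object the real client uses.

```python
import queue
from contextlib import contextmanager


class FakeTransport:
    """Hypothetical stand-in for a client transport; counts opened sockets."""
    opened = 0

    def open(self):
        FakeTransport.opened += 1

    def close(self):
        pass


class ConnectionPool:
    """Open a fixed number of connections up front and hand them out for reuse."""

    def __init__(self, factory, size):
        self._idle = queue.Queue()
        for _ in range(size):
            conn = factory()
            conn.open()  # the only place a socket is opened
            self._idle.put(conn)

    @contextmanager
    def connection(self):
        conn = self._idle.get()  # blocks until a connection is free
        try:
            yield conn
        finally:
            self._idle.put(conn)  # return it to the pool instead of closing


pool = ConnectionPool(FakeTransport, size=2)
for _ in range(1000):
    with pool.connection() as conn:
        pass  # issue one request on the borrowed connection

print(FakeTransport.opened)  # 2
```

The sketch leaves out health checks and reconnects, so a connection that breaks mid-request would be returned to the pool as-is; a real pool needs to handle that.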

Re: delete mutation

2010-05-21 Thread Mark Greene
Oh ok thanks guys. That all makes perfect sense. I mistakenly thought the timestamp was needed as a discriminator for the delete. On Fri, May 21, 2010 at 10:24 AM, Jonathan Ellis wrote: > ts is when the delete is performed, not the ts of the column you're > deleting. > > you need to provide a

Re: how to decommission two slow nodes?

2010-05-21 Thread Ran Tavory
Thanks, I'll try that next time. On May 21, 2010 5:23 PM, "Jonathan Ellis" wrote: There is no other way to make the cluster "forget" a node w/o decommission / removetoken. You could do everything up to "stop the entire cluster" and do a rolling restart instead, kill the 2 nodes you want to rem

Re: Scaling problems

2010-05-21 Thread Ian Soboroff
On the to-do list for today. Is there a tool to aggregate all the JMX stats from all nodes? I mean, something a little more complete than nagios. Ian On Fri, May 21, 2010 at 10:23 AM, Jonathan Ellis wrote: > you should check the jmx stages I posted about > > On Fri, May 21, 2010 at 7:05 AM, I

Re: delete mutation

2010-05-21 Thread Jonathan Ellis
ts is when the delete is performed, not the ts of the column you're deleting. you need to provide a ts for every operation so that if there are multiple clients updating the same column at the same time, cassandra can decide who "wins." On Fri, May 21, 2010 at 6:55 AM, Mark Greene wrote: > Is th

Re: Scaling problems

2010-05-21 Thread Jonathan Ellis
you should check the jmx stages I posted about On Fri, May 21, 2010 at 7:05 AM, Ian Soboroff wrote: > Just an update.  I rolled the memtable size back to 128MB.  I am still > seeing that the daemon runs for a while with reasonable heap usage, but then > the heap climbs up to the max (6GB in this

Re: how to decommission two slow nodes?

2010-05-21 Thread Jonathan Ellis
There is no other way to make the cluster "forget" a node w/o decommission / removetoken. You could do everything up to "stop the entire cluster" and do a rolling restart instead, kill the 2 nodes you want to remove, and then do removetoken, which would still do extra i/o but at least the slow nod

Re: What happened if one server involved in the process of data reading fail?

2010-05-21 Thread Jonathan Ellis
http://wiki.apache.org/cassandra/ArchitectureInternals 2010/5/20 史英杰 : > What inner mechanism does Cassandra adopt to get this kind of fault > tolerance? > > 2010/5/20 Simon Smith >> >> On Thu, May 20, 2010 at 8:08 AM, 史英杰 wrote: >> > Hi, All, >> > I am now learning the mechanism Cassandra a

Re: delete mutation

2010-05-21 Thread Brandon Williams
On Fri, May 21, 2010 at 8:55 AM, Mark Greene wrote: > Is there a particular reason why timestamp is required to do a deletion? Because a delete is just a write with a tombstone flag, and the write with the highest timestamp wins. > If I'm reading the API docs correctly, this would require a r
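The rule stated in this reply (a delete is a write carrying a tombstone flag, and the highest timestamp wins) can be sketched as a toy model. This is not Cassandra's code: `Cell`, `reconcile`, `delete`, and `read` are hypothetical names, and ties are resolved arbitrarily here, which may differ from what the server does.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Cell:
    value: Optional[bytes]
    timestamp: int
    tombstone: bool = False


def reconcile(a: Cell, b: Cell) -> Cell:
    """Keep the version with the higher timestamp (tie handling is arbitrary)."""
    return a if a.timestamp >= b.timestamp else b


def delete(timestamp: int) -> Cell:
    """A delete is modeled as an ordinary write that carries a tombstone flag."""
    return Cell(None, timestamp, tombstone=True)


def read(versions: List[Cell]) -> Optional[bytes]:
    """Reconcile all versions of a column; a winning tombstone reads as absent."""
    winner = versions[0]
    for cell in versions[1:]:
        winner = reconcile(winner, cell)
    return None if winner.tombstone else winner.value


write = Cell(b"v1", timestamp=10)
print(read([write, delete(11)]))  # delete is newer, so the column is gone
print(read([write, delete(9)]))   # delete is older, so the write survives
```

In this model the client never needs the existing column's timestamp: it stamps the delete with its own current time, so no read is required first.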

Re: Scaling problems

2010-05-21 Thread Ian Soboroff
Just an update. I rolled the memtable size back to 128MB. I am still seeing that the daemon runs for a while with reasonable heap usage, but then the heap climbs up to the max (6GB in this case, should be plenty) and it starts GCing, without much getting cleared. The client catches lots of excep

delete mutation

2010-05-21 Thread Mark Greene
Is there a particular reason why timestamp is required to do a deletion? If I'm reading the API docs correctly, this would require a read of the column first, correct? I know there is an issue filed to have a better way to delete via range slices but I wanted to make sure this was the only way to

Re: New Changes in Cass 0.7 Thrift API Interface

2010-05-21 Thread Gary Dusbabek
On Thu, May 20, 2010 at 22:16, Arya Goudarzi wrote: > > P.S. By the way, if someone grants me access, I'd like to contribute to the > documentation on Apache Cassandra. > I believe anybody can create a wiki account and make changes. Have at it! Gary.