Re: Consistency Level throughput

2011-05-26 Thread Ryu Kobayashi
My question is about the throughput in each case. > In general, cluster throughput = single node throughput * number of > nodes / replication factor. Yes, I think so too. But what I really want to ask about is that there are no results. Could you look at the chart I made? http://goo.gl/mACQa 2011/5/27 Maki Watanabe

Re: Forcing Cassandra to free up some space

2011-05-26 Thread Jeremy Hanna
For the purposes of clearing out disk space, you might also occasionally check to see if you have snapshots that you no longer need. Certain operations create snapshots (point-in-time backups of sstables) in the (default) /var/lib/cassandra/data//snapshots directory. If you are absolutely sure

Re: Consistency Level throughput

2011-05-26 Thread Maki Watanabe
I assume your question is about how CL affects the throughput. In theory, I believe CL will not affect the throughput of the Cassandra system. At any CL, the coordinator node needs to submit write/read requests according to the RF specified for the KS. But for the latency, CL will affect
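As a rough illustration of the point above (the coordinator sends requests according to the replication factor at every consistency level, while the level only sets how many replica responses it waits for), here is a minimal Python sketch. The function name `required_acks` is hypothetical and is not part of any Cassandra client; the QUORUM rule used is the usual floor(RF / 2) + 1.

```python
def required_acks(consistency_level: str, replication_factor: int) -> int:
    """Number of replica responses the coordinator waits for (illustrative only)."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    levels = {
        "ONE": 1,
        "TWO": 2,
        "THREE": 3,
        "QUORUM": replication_factor // 2 + 1,  # majority of the replicas
        "ALL": replication_factor,              # every replica must respond
    }
    acks = levels[consistency_level.upper()]
    if acks > replication_factor:
        raise ValueError("consistency level needs more replicas than RF provides")
    return acks


if __name__ == "__main__":
    # Replication factor 3, as in the environment described in this thread.
    for level in ("ONE", "TWO", "QUORUM", "THREE", "ALL"):
        print(level, required_acks(level, 3))
```

With RF=3, QUORUM and TWO both wait for two replicas, and THREE and ALL both wait for all three; the number of requests sent does not change, only how long the coordinator has to wait.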

Re: Consistency Level throughput

2011-05-26 Thread Jonathan Ellis
I'm afraid I don't quite understand the question. In general, cluster throughput = single node throughput * number of nodes / replication factor. On Thu, May 26, 2011 at 9:39 PM, Ryu Kobayashi wrote: > Hi, > > Question of Consistency Level throughput. > > Environment: > 6 nodes. Replication fact
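The rule of thumb quoted above can be worked through with the numbers from this thread (6 nodes, replication factor 3). This is only a sketch: the single-node figure of 10,000 operations per second is a made-up value for illustration, and `cluster_throughput` is a hypothetical helper name.

```python
def cluster_throughput(single_node_ops: float, nodes: int, replication_factor: int) -> float:
    """Rule of thumb: single node throughput * number of nodes / replication factor."""
    if nodes < 1 or replication_factor < 1:
        raise ValueError("nodes and replication_factor must be at least 1")
    return single_node_ops * nodes / replication_factor


if __name__ == "__main__":
    # 10_000 ops/s per node is an assumed value, not a measurement from the thread.
    print(cluster_throughput(10_000, nodes=6, replication_factor=3))
```

Under that assumption the estimate is 20,000 operations per second for the cluster, because each operation is handled by three of the six nodes.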

ghost node?

2011-05-26 Thread jonathan . colby
A node with IP 10.46.108.102 was removed from the cluster several days ago but the cassandra logs are full of these messages! Anyone know how to permanently remove this information? I'm beginning to think it is affecting the throughput of the live nodes. INFO [FlushWriter:1] 2011-05-27 04:28

Consistency Level throughput

2011-05-26 Thread Ryu Kobayashi
Hi, I have a question about Consistency Level throughput. Environment: 6 nodes. Replication factor is 3. There was no throughput difference between ONE and QUORUM. ALL was just extremely slow. Not ONE had only half the throughput. ONE, TWO and THREE gave similar results. Is there any difference between 2 nodes a

Re: Forcing Cassandra to free up some space

2011-05-26 Thread Jeffrey Kesselman
I'm also not sure that will guarantee all space is cleaned up. It really depends on what you are doing inside Cassandra. If you have your own garbage collection that is just in some way tied to the gc run, then it will run when it runs. If otoh you are associating records in your storage with specif

Re: Forcing Cassandra to free up some space

2011-05-26 Thread Jeffrey Kesselman
You really should qualify that with "on all currently known versions of Hotspot". Not trying to give you grief, really, but it's an important limitation to understand. On Thu, May 26, 2011 at 10:16 PM, Jonathan Ellis wrote: > In summary, System.gc works fine unless you've deliberately done > some

Re: Forcing Cassandra to free up some space

2011-05-26 Thread Jonathan Ellis
In summary, System.gc works fine unless you've deliberately done something like setting the -XX:+DisableExplicitGC flag. On Thu, May 26, 2011 at 5:58 PM, Konstantin Naryshkin wrote: > So, in summary, there is no way to predictably and efficiently tell Cassandra > to get rid of all of the extra

Re: Forcing Cassandra to free up some space

2011-05-26 Thread Jake Luciani
"Is there a way for me to make (or even gently suggest to) Cassandra that it may be a good time to free up some space?" Disregarding what's been said and until ref-counting is implemented this is a useful tool to gently suggest cleanup: https://github.com/ceocoder/jmxgc On Thu, May 26, 2011 at

Re: Forcing Cassandra to free up some space

2011-05-26 Thread Jeffrey Kesselman
Not if it depends on a side effect of garbage collection such as finalizers. It ought to publish its own JMX control to cause that to happen. On Thu, May 26, 2011 at 6:58 PM, Konstantin Naryshkin wrote: > So, in summary, there is no way to predictably and efficiently tell Cassandra > to get r

Re: Forcing Cassandra to free up some space

2011-05-26 Thread Konstantin Naryshkin
So, in summary, there is no way to predictably and efficiently tell Cassandra to get rid of all of the extra space it is using on disk? - Original Message - From: "Jeffrey Kesselman" To: user@cassandra.apache.org Sent: Thursday, May 26, 2011 8:57:49 PM Subject: Re: Forcing Cassandra to f

Re: Re: nodetool move trying to stream data to node no longer in cluster

2011-05-26 Thread jonathan . colby
Hi Aaron - Thanks a lot for the great feedback. I'll try your suggestion on removing it as an endpoint with jmx. On , aaron morton wrote: Off the top of my head the simple way to stop invalid end point state being passed around is a full cluster stop. Obviously that's not an option. The probl

Re: EC2 node adding trouble

2011-05-26 Thread aaron morton
This is the *most* useful page on the wiki http://wiki.apache.org/cassandra/Operations Hope that helps. - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 27 May 2011, at 02:06, Marcus Bointon wrote: > On 26 May 2011, at 15:21, Sasha Do

Re: EC2 node adding trouble

2011-05-26 Thread aaron morton
This ticket may be just the ticket :) https://issues.apache.org/jira/browse/CASSANDRA-2452 Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 27 May 2011, at 01:16, Sasha Dolgy wrote: > As an aside, you can also use that command to

Re: nodetool move trying to stream data to node no longer in cluster

2011-05-26 Thread aaron morton
Off the top of my head the simple way to stop invalid end point state being passed around is a full cluster stop. Obviously that's not an option. The problem is if one node has the IP it will share it around with the others. Out of interest take a look at the o.a.c.db.FailureDetector MBean getA

Re: Forcing Cassandra to free up some space

2011-05-26 Thread Jeffrey Kesselman
Which JVM? Which collector? There have been and continue to be many. Hotspot itself supports a number of different collectors with different behaviors. Many of them do not collect every candidate on every gc, but merely the easiest ones to find. This is why depending on finalizers is a *bad*

Re: OOM recovering failed node with many CFs

2011-05-26 Thread Jonathan Ellis
We've applied a fix to the 0.7 branch in https://issues.apache.org/jira/browse/CASSANDRA-2714. The patch probably applies to 0.7.6 as well. On Thu, May 26, 2011 at 11:36 AM, Flavio Baronti wrote: > I tried the manual copy you suggest, but the SystemTable.checkHealth() > function > complains it c

Re: Forcing Cassandra to free up some space

2011-05-26 Thread Jonathan Ellis
I've read the relevant source. While you're pedantically correct re the spec, you're wrong as to what the JVM actually does. On Thu, May 26, 2011 at 3:14 PM, Jeffrey Kesselman wrote: > Some references... > > "An object enters an unreachable state when no more strong references > to it exist. When

Re: Forcing Cassandra to free up some space

2011-05-26 Thread Jeffrey Kesselman
Some references... "An object enters an unreachable state when no more strong references to it exist. When an object is unreachable, it is a candidate for collection. Note the wording: Just because an object is a candidate for collection doesn't mean it will be immediately collected. The JVM is fr

Re: Forcing Cassandra to free up some space

2011-05-26 Thread Jeffrey Kesselman
I'm sorry. This was my business at Sun. You are certainly wrong about the Hotspot VM. See this chapter of my book http://java.sun.com/docs/books/performance/1st_edition/html/JPAppGC.fm.html#998394 On Thu, May 26, 2011 at 3:53 PM, Jonathan Ellis wrote: > It's a common misunderstanding that

Re: Forcing Cassandra to free up some space

2011-05-26 Thread Jonathan Ellis
It's a common misunderstanding that System.gc is only a suggestion; on any VM you're likely to run Cassandra on, System.gc will actually invoke a full collection. On Thu, May 26, 2011 at 2:18 PM, Jeffrey Kesselman wrote: > Actually this is no guarantee. It's a common misunderstanding that > Syst

Re: Forcing Cassandra to free up some space

2011-05-26 Thread Jeffrey Kesselman
Actually this is no guarantee. It's a common misunderstanding that System.gc "forces" gc. It does not. It is a suggestion only. The vm always has the option as to when and how much it gcs. On May 26, 2011 2:51 PM, "Jonathan Ellis" wrote:

Re: PHP CQL Driver

2011-05-26 Thread Kwasi Gyasi - Agyei
yep, works perfectly @ http://caqel.deadcafe.org/ I will try my luck @ phpcassa. Thanks for your time gentlemen. On Thu, May 26, 2011 at 8:59 PM, Sasha Dolgy wrote: > maybe you'd have more luck discussing this on the phpcassa list? > https://groups.google.com/forum/#!forum/phpcassa > > more ex

Re: PHP CQL Driver

2011-05-26 Thread Sasha Dolgy
maybe you'd have more luck discussing this on the phpcassa list? https://groups.google.com/forum/#!forum/phpcassa more experience there with PHP and Cassandra ... Are you able to validate the query works when not using PHP? On Thu, May 26, 2011 at 8:51 PM, Kwasi Gyasi - Agyei wrote: > got syste

Re: PHP CQL Driver

2011-05-26 Thread Kwasi Gyasi - Agyei
got system in debug mode the following query fails --- CREATE COLUMNFAMILY magic (KEY text PRIMARY KEY, monkey ) WITH comparator = text AND default_validation = text PHP error reads - #0 /Volumes/DATA/Project/libs/php/phpCQL/vendor/cassandra/c

Re: Forcing Cassandra to free up some space

2011-05-26 Thread Jonathan Ellis
You'd have to call System.gc via JMX. https://issues.apache.org/jira/browse/CASSANDRA-2521 is open to address this, btw. On Thu, May 26, 2011 at 1:09 PM, Konstantin Naryshkin wrote: > I have a basic understanding of how Cassandra handles the file system > (flushes in Memtables out to SSTables,

Forcing Cassandra to free up some space

2011-05-26 Thread Konstantin Naryshkin
I have a basic understanding of how Cassandra handles the file system (flushes in Memtables out to SSTables, SSTables get compacted) and I understand that old files are only deleted when a node is restarted, when Java does a GC, or when Cassandra feels like it is running out of space. My quest

Re: OOM recovering failed node with many CFs

2011-05-26 Thread Flavio Baronti
I tried the manual copy you suggest, but the SystemTable.checkHealth() function complains it can't load the system files. Log follows, I will gather some more info and create a ticket as soon as possible. INFO [main] 2011-05-26 18:25:36,147 AbstractCassandraDaemon.java Logging initialized INFO

Re: OOM recovering failed node with many CFs

2011-05-26 Thread Jonathan Ellis
Sounds like a legitimate bug, although looking through the code I'm not sure what would cause a tight retry loop on migration announce/rectify. Can you create a ticket at https://issues.apache.org/jira/browse/CASSANDRA ? As a workaround, I would try manually copying the Migrations and Schema sstab

OOM recovering failed node with many CFs

2011-05-26 Thread Flavio Baronti
I can't seem to be able to recover a failed node on a database where i did many updates to the schema. I have a small cluster with 2 nodes, around 1000 CF (I know it's a lot, but it can't be changed right now), and ReplicationFactor=2. I shut down a node and cleaned its data entirely, then trie

Re: EC2 node adding trouble

2011-05-26 Thread Marcus Bointon
On 26 May 2011, at 15:21, Sasha Dolgy wrote: > Turn the node off, remove the node from the ring using nodetool and > removetoken i've found this to be the best problem-free way. > Maybe it's better now ... > http://blog.sasha.dolgy.com/2011/03/apache-cassandra-nodetool.html So I'd need to ha

Re: EC2 node adding trouble

2011-05-26 Thread Sasha Dolgy
On Thu, May 26, 2011 at 3:12 PM, Marcus Bointon wrote: > I'd like to make sure I've got the right sequence of operations for adding a > node without downtime. If I'm going from 2 to 3 nodes: > > 1 Calculate new initial_token values using the python script > 2 Change token values in existing nodes
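The first step mentioned above, calculating initial_token values, can be sketched in a few lines of Python. This is a minimal sketch and not the script referred to in the message: it assumes the RandomPartitioner, whose token range runs from 0 to 2**127, and spaces the tokens evenly around the ring. The function name `initial_tokens` is hypothetical.

```python
from typing import List

# Assumption: RandomPartitioner, with a token range of 0 .. 2**127.
RING_SIZE = 2 ** 127


def initial_tokens(node_count: int) -> List[int]:
    """Evenly spaced initial_token values for a ring of node_count nodes."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return [i * RING_SIZE // node_count for i in range(node_count)]


if __name__ == "__main__":
    # Going from 2 to 3 nodes, as in the question above.
    for index, token in enumerate(initial_tokens(3)):
        print(f"node {index}: initial_token = {token}")
```

Comparing the output for 2 and 3 nodes shows why existing nodes need new tokens as well: only the token 0 stays the same when the ring grows from two nodes to three.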

Re: EC2 node adding trouble

2011-05-26 Thread Sasha Dolgy
As an aside, you can also use that command to pull meta-data about instances in AWS. I have implemented this to maintain a list of seed nodes. This way, when a new instance is brought online, the default cassandra.yaml is `enhanced` to contain a dynamic list of valid seeds, proper hostname and a

Re: EC2 node adding trouble

2011-05-26 Thread Marcus Bointon
On 24 May 2011, at 23:58, Sameer Farooqui wrote: > So, once you know what token each of the 3 nodes should have, shut down the > first two nodes, change their tokens and add the correct token to the 3rd > node (in the YAML file). I'd like to make sure I've got the right sequence of operations f

Re: EC2 node adding trouble

2011-05-26 Thread Marcus Bointon
Thanks for all your helpful suggestions - I've now got it working. It was down to a combination of things. 1. A missing rule in a security group 2. A missing DNS name for the new node, so its default name was defaulting to localhost 3. Google DNS caching the failed DNS lookup for the full durati

Re: Corrupted Counter Columns

2011-05-26 Thread Utku Can Topçu
Some additional information on the settings: I'm using CL.ONE for both reading and writing; and replicate_on_write is true on the Counters CF. I think the problem occurs after a restart when the commitlogs are read. On Thu, May 26, 2011 at 2:21 PM, Utku Can Topçu wrote: > Hello, > > I'm using

Corrupted Counter Columns

2011-05-26 Thread Utku Can Topçu
Hello, I'm using the 0.8.0-rc1, with RF=2 and 4 nodes. Strangely counters are corrupted. Say, the actual value should be: 51664 and the value that cassandra sometimes outputs is: either 51664 or 18651001. And I have no idea on how to diagnose the problem or reproduce it. Can you help me in

Re: Priority queue in a single row - performance falls over time

2011-05-26 Thread Paul Loy
persistent [priority] queues are better suited to something like HornetQ than Cassandra. On Wed, May 25, 2011 at 9:10 PM, Dan Kuebrich wrote: > It sounds like the problem is that the row is getting filled up with > tombstones and becoming enormous? Another idea then, which might not be > worth t

Re: How to programmatically index an existed column?

2011-05-26 Thread Dikang Gu
Hi Aaron, Thank you for your reminder. I've found out the solution myself, and I share it here: KeyspaceDefinition keyspaceDefinition = cluster.describeKeyspace(KEYSPACE); ColumnFamilyDefinition cdf = keyspaceDefinition.getCfDefs().get(0); BasicColumnFamilyDefinition columnFamilyDefinition = ne

Re: nodetool move trying to stream data to node no longer in cluster

2011-05-26 Thread Jonathan Colby
@Aaron - Unfortunately I'm still seeing messages like: " is down", removing from gossip, although not with the same frequency. And repair/move jobs don't seem to try to stream data to the removed node anymore. Anyone know how to totally purge any stored gossip/endpoint data on nodes that we

Re: EC2 node adding trouble

2011-05-26 Thread Marcus Bointon
On 26 May 2011, at 00:17, aaron morton wrote: > I've seen discussion of using the EIP but I do not have direct experience. The idea is not to use the external IP, but the external DNS name because of this very useful trick (please excuse me if you already know this!): Say the DNS name of an el

Re: PHP CQL Driver

2011-05-26 Thread aaron morton
Cool, this may be a better discussion for the client-dev list http://www.mail-archive.com/client-dev@cassandra.apache.org/ I would start by turning up the server logging to DEBUG and watching your update / select queries. Cheers - Aaron Morton Freelance Cassandra Developer @aar

Re: How to programmatically index an existed column?

2011-05-26 Thread aaron morton
Please post to one list at a time. Otherwise people may spend their time helping you when someone already has. Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 26 May 2011, at 17:35, Dikang Gu wrote: > > I want to build a second

Re: nodetool move trying to stream data to node no longer in cluster

2011-05-26 Thread aaron morton
cool. I was going to suggest that but as you already had the move running I thought it may be a little drastic. Did it show any progress ? If the IP address is not responding there should have been some sort of error. Cheers - Aaron Morton Freelance Cassandra Developer @aaro