Re: Grep for what?
ps auxww | grep cassandra (and upgrade to 0.6)

On Mon, Apr 5, 2010 at 10:20 AM, JEFFERY SCHMITZ wrote:
> Warning: this is a newbie question.
>
> On startup I get
>
> [r...@marduk bin]# ./cassandra -f
> Listening for transport dt_socket at address:
> INFO - Sampling index for /var/lib/cassandra/data/system/LocationInfo-1-Data.db
> INFO - Replaying /var/lib/cassandra/commitlog/CommitLog-1270047913578.log
> INFO - LocationInfo has reached its threshold; switching in a fresh Memtable
> INFO - Enqueuing flush of Memtable(LocationInfo)@2132679615
> INFO - Sorting Memtable(LocationInfo)@2132679615
> INFO - Writing Memtable(LocationInfo)@2132679615
> INFO - Completed flushing /var/lib/cassandra/data/system/LocationInfo-2-Data.db
> INFO - Log replay complete
> INFO - Saved Token found: 75598148438183751486026363636316999593
> INFO - Starting up server gossip
> INFO - Cassandra starting up...
>
> Okay, so it's running, but I can't figure out what the PID is. Grepping for
> cassandra or apache turns up zip.
>
> Thanks,
>
> Jeffery
Re: Bloom filters
Please read this: http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html

On Wed, Apr 7, 2010 at 1:27 PM, S Ahmed wrote:
> Just reading up on Bloom filters; a few questions.
>
> Basically, a Bloom filter gives you a true/false answer for whether a
> particular key exists. It *may* give you a false positive (saying a key
> exists when it doesn't), but never a false negative (saying a key doesn't
> exist when it does).
>
> The core of a Bloom filter is its hashing mechanism, which marks the
> in-memory bit array when a key is added.
>
> 1. Is the only place Bloom filters are used in Cassandra when you want to
> see if a particular key exists on a particular node?
>
> 2. Are Bloom filters a fixed size, or do they adjust to the number of keys?
> Any example size?
>
> 3. Bloom filters don't give false negatives:
> So you hit a node and perform a lookup in the Bloom filter for a key. It
> says "yes", but when you do the lookup the object returned is null, so then
> you flag that this node needs this particular key during replication.
>
> Have I grasped this concept?
>
> Really loving this project; learning a lot from the code. It would be great
> if someone could do a walkthrough of common functionality in a detailed way
> :)
Re: Bloom filters
(I should mention: I'm suggesting the Dynamo paper for general background, not for Bloom filters, which are fantastically covered in the Wikipedia entry.)

On Wed, Apr 7, 2010 at 4:11 PM, Benjamin Black wrote:
> Please read this:
> http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
>
> [original questions snipped]
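For concreteness: in Cassandra, each SSTable carries a Bloom filter over its row keys, sized when the SSTable is written according to how many keys it holds, so a read can skip SSTables that cannot contain the requested key. A minimal sketch of the data structure itself in Java (the double hashing here is illustrative, not Cassandra's actual implementation):

    import java.util.BitSet;

    // Minimal Bloom filter sketch. k hash functions set/test k bits in a
    // fixed-size bit set. A bit may be set by many different keys, which is
    // why "maybe present" can be wrong (false positive); but a clear bit
    // proves a key was never added (no false negatives).
    public class SimpleBloomFilter {
        private final BitSet bits;
        private final int numBits;
        private final int numHashes;

        public SimpleBloomFilter(int numBits, int numHashes) {
            this.bits = new BitSet(numBits);
            this.numBits = numBits;
            this.numHashes = numHashes;
        }

        // Derive the i-th bit index via double hashing (illustrative only).
        private int index(String key, int i) {
            int h1 = key.hashCode();
            int h2 = (h1 >>> 16) | 1; // odd, so the k indexes differ
            return Math.abs((h1 + i * h2) % numBits);
        }

        public void add(String key) {
            for (int i = 0; i < numHashes; i++)
                bits.set(index(key, i));
        }

        public boolean mightContain(String key) {
            for (int i = 0; i < numHashes; i++)
                if (!bits.get(index(key, i)))
                    return false; // definitely absent
            return true;          // probably present
        }
    }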
set_keyspace()
Can someone enlighten me as to the purpose of set_keyspace() and the elimination of the keyspace args from calls? I understand there was a discussion of the issue before I joined the list several months ago. For those with several or many keyspaces, as when using a keyspace per customer, it is a huge step backwards. The change will require either a significant increase in the number of calls or a complete redesign and reimplementation of connection management. Neither is attractive. Some insight into this decision would be appreciated.

b
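To make the objection concrete, a sketch of the two options once the keyspace becomes connection state: Cassandra.Client and set_keyspace() are from the Thrift API, but the ConnectionPool type and the pooling logic are hypothetical, for illustration only:

    import java.util.concurrent.ConcurrentHashMap;
    import org.apache.cassandra.thrift.Cassandra;

    public class KeyspaceClients {
        // Option 1: one shared pool, with a set_keyspace() call (an extra
        // round trip) whenever a connection's keyspace may have changed.
        static void useWithSwitch(Cassandra.Client client, String keyspace)
                throws Exception {
            client.set_keyspace(keyspace);
            // ... perform reads/writes for this customer ...
        }

        // Option 2: one pool per keyspace, so connections never switch.
        private final ConcurrentHashMap<String, ConnectionPool> pools =
                new ConcurrentHashMap<String, ConnectionPool>();

        Cassandra.Client checkout(String keyspace) {
            ConnectionPool pool = pools.get(keyspace);
            if (pool == null) {
                pools.putIfAbsent(keyspace, new ConnectionPool(keyspace));
                pool = pools.get(keyspace);
            }
            return pool.acquire(); // many customers => many pools to manage
        }

        // Hypothetical minimal pool; a real client library would provide this.
        static class ConnectionPool {
            ConnectionPool(String keyspace) { /* connect + set_keyspace once */ }
            Cassandra.Client acquire() {
                throw new UnsupportedOperationException("sketch only");
            }
        }
    }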
Re: 3-node balanced system
in ruby:

    def token(nodes)
      1.upto(nodes) {|i| p (2**127/nodes) * i}
    end

    >> token(3)
    56713727820156410577229101238628035242
    113427455640312821154458202477256070484
    170141183460469231731687303715884105726

On Thu, Jun 17, 2010 at 12:08 PM, Lev Stesin wrote:
> Hi,
>
> What is the correct procedure to create a well-balanced cluster (in
> terms of key distribution)? From what I understand, whenever I add a
> new node it takes half from its neighbor. How can I make each node
> contain 1/3 of the keys in a 3-node cluster? Thanks.
>
> --
> Lev
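The same computation in Java, for anyone who would rather script it on the JVM (a sketch: 2**127 is the token range assumed by RandomPartitioner, and each printed value would go into the corresponding node's InitialToken setting):

    import java.math.BigInteger;

    // Print evenly spaced initial tokens for an N-node cluster using
    // RandomPartitioner's 0..2^127 token space.
    public class Tokens {
        public static void main(String[] args) {
            int nodes = 3;
            BigInteger slice = BigInteger.ONE.shiftLeft(127)
                                             .divide(BigInteger.valueOf(nodes));
            for (int i = 1; i <= nodes; i++)
                System.out.println(slice.multiply(BigInteger.valueOf(i)));
        }
    }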
Re: cassandra increment counters, Jira #1072
On Thu, Aug 12, 2010 at 10:23 AM, Kelvin Kakugawa wrote:
>
> I think the underlying unanswered question is whether #1072 is a niche
> feature or whether it should be brought into trunk.
>

This should not be an unanswered question! #1072 should be considered essential, as it enables numerous use cases that currently require bolting something like memcached or redis onto the side to handle counters. +1 on getting this into trunk ASAP.

b
Re: cassandra increment counters, Jira #1072
On Thu, Aug 12, 2010 at 8:54 PM, Jonathan Ellis wrote:
> There are two concerns that give me pause.
>
> The first is that 1072 is tackling a use case that Cassandra already
> handles well: a high volume of writes to a counter, with a low volume of
> reads. (This can be done by inserting uuids into a counter row and
> aggregating them in the background, at read time, or with some
> combination of the two. The counter rows can be sharded if necessary.)

This is simply not an acceptable alternative and just can't be called handling it "well". It is equivalent to "make the users do it", which could be said of almost anything. The reasons #1072 is so valuable:

1) It does not require _any_ user action.
2) It does not change the EC-centric model of Cassandra.
3) It meets the requirements of many major users who would otherwise have to use another storage system.

> The second is that the approach in 1072 resembles an entirely separate
> system that happens to use parts of the Cassandra infrastructure -- the
> thrift API, the MessagingService, the sstable format -- but isn't
> really part of it. ConsistencyLevel is not respected, and special
> cases abound to weld things in that don't fit, e.g. the AES/Streaming
> business.

Then let's find ways to make it as elegant as it can be. Ultimately, this functionality needs to be in Cassandra or users will simply migrate someplace else for this extremely common use case.

b
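For readers unfamiliar with the pattern Jonathan describes, a sketch of the do-it-yourself counter (the Client interface here is hypothetical shorthand for a Thrift or Hector client, not a real API):

    import java.util.UUID;

    // Do-it-yourself distributed counter: every increment inserts a new
    // column (named with a random UUID) into the counter's row, so writes
    // never conflict; the value is the sum of all columns, which the client
    // must periodically roll up to keep the row bounded.
    public class UuidCounter {
        // Hypothetical minimal client interface, standing in for Thrift/Hector.
        interface Client {
            void insertColumn(String row, String name, String value);
            Iterable<String> columnValues(String row);
        }

        private final Client client;
        private final String row;

        public UuidCounter(Client client, String row) {
            this.client = client;
            this.row = row;
        }

        public void increment(long delta) {
            client.insertColumn(row, UUID.randomUUID().toString(),
                                Long.toString(delta));
        }

        public long read() {
            long total = 0;
            for (String v : client.columnValues(row))
                total += Long.parseLong(v);
            // A background task would replace the columns just summed with a
            // single rolled-up column (this is the part users must manage).
            return total;
        }
    }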
Re: cassandra increment counters, Jira #1072
On Fri, Aug 13, 2010 at 6:24 AM, Jonathan Ellis wrote:
>>
>> This is simply not an acceptable alternative and just can't be called
>> handling it "well".
>
> What part is it handling poorly, at a technical level? This is almost
> exactly what 1072 does internally -- we are concerned here with the
> high write, low read volume case.
>

Requiring clients to directly manage the counter rows in order to periodically compress or segment them. Yes, you can emulate the behavior. No, that is not handling it well.

>> It is equivalent to "make the users do it", which
>> could be said of almost anything.
>
> I strongly feel we should be in the business of providing building
> blocks, not special cases on top of that. (But see below, I *do*
> think the 580 version vectors are the kind of building block we want!)
>

I agree, 580 is really valuable and should be in. The problem for high-write-rate, distributed counters is the requirement of read-before-write inherent in such vector-based approaches. Am I missing some aspect of 580 that precludes that?

>> The reasons #1072 is so valuable:
>>
>> 1) It does not require _any_ user action.
>
> This can be addressed at the library level. Just as our first stab at
> ZK integration was a rather clunky patch, "cages" is better.
>

Certainly, but it would be hard to argue (and I am not) that the tightly synchronized behavior of ZK is a good match for Cassandra (mixing in Paxos could make for some neat options, but that's another debate...).

>> 2) It does not change the EC-centric model of Cassandra.
>
> It does, though. 1072 is *not* a version vector-based approach --
> that would be 580. Read the 1072 design doc, if you haven't. (Thanks
> to Kelvin for writing that up!)
>

Nor is Cassandra right now. I know 1072 isn't vector based, and I think that is in its favor _for this application_.

> I'm referring in particular to reads requiring CL.ALL. (My
> understanding is that in the previous design, a "master" replica was
> chosen and was always written to first.) Both of these break "the
> EC-centric model", and that is precisely the objection I made when I
> said "ConsistencyLevel is not respected." I don't think this is
> fixable in the 1072 approach. I would be thrilled to be wrong.
>

It is EC in that the total for a counter is unknown until resolved on read. Yes, it does not respect CL, but since it can only be used in one way, I don't see that as a disadvantage.

>>> The second is that the approach in 1072 resembles an entirely separate
>>> system that happens to use parts of the Cassandra infrastructure -- the
>>> thrift API, the MessagingService, the sstable format -- but isn't
>>> really part of it. ConsistencyLevel is not respected, and special
>>> cases abound to weld things in that don't fit, e.g. the AES/Streaming
>>> business.
>>
>> Then let's find ways to make it as elegant as it can be. Ultimately,
>> this functionality needs to be in Cassandra or users will simply
>> migrate someplace else for this extremely common use case.
>
> This is what I've been pushing for. The version vector approach to
> counting (i.e. 580 as opposed to 1072) is exactly the more elegant,
> EC-centric approach, and it addresses a case that we *don't* currently
> handle well (counters with a higher read volume than 1072 targets).
>

Perhaps I missed something: does counting with 580 require a read before a counter update (a read local to the node, not a client read)?

b
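To make the read-before-write point concrete: in a version-vector style counter, each replica owns one component of the value, and incrementing requires reading that component first. A schematic sketch, not the actual #580 design:

    import java.util.HashMap;
    import java.util.Map;

    // Schematic vector-style counter: the value is the sum of per-replica
    // components. Note the increment path reads before it writes -- the
    // cost being asked about for high write rates.
    public class VectorCounter {
        private final Map<String, Long> components = new HashMap<String, Long>();

        public void increment(String replicaId, long delta) {
            Long current = components.get(replicaId);               // read...
            long base = (current == null) ? 0 : current;
            components.put(replicaId, base + delta);                 // ...then write
        }

        public long value() {
            long sum = 0;
            for (long c : components.values())
                sum += c;
            return sum;
        }
    }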
Re: Locking in cassandra
This is the locking implementation:
http://commons.apache.org/lang/api-2.4/org/apache/commons/lang/NotImplementedException.html

And you might benefit from reading these:
http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
http://www.slideshare.net/benjaminblack/introduction-to-cassandra-replication-and-consistency

b

On Mon, Aug 16, 2010 at 6:07 AM, Maifi Khan wrote:
> Hi,
> How is locking implemented in Cassandra? Say I have 10 nodes and
> I want to write to 6 nodes, which is (n+1)/2 rounded up.
> Will it lock the 6 nodes first and then start writing? Or will it write
> one by one and see if it can write to 6 nodes?
> How is this implemented? Which package does the locking?
> Thanks in advance.
Re: Order preserving partitioning strategy
https://svn.apache.org/repos/asf/cassandra/trunk/src/java/org/apache/cassandra/dht/OrderPreservingPartitioner.java

On Sun, Aug 22, 2010 at 10:46 AM, Hien. To Trong wrote:
> Hi,
> I am developing a system with some features like Cassandra's, and I want
> to add an order preserving partitioning strategy, but I don't know how to
> implement it.
>
> The Cassandra paper ("Cassandra - A Decentralized Structured Storage
> System") says: "Cassandra partitions data across the cluster using
> consistent hashing but uses an order preserving hash function (OPHF) to
> do so."
>
> I wonder:
>
> 1. Does Cassandra still use a hash function (the alternative strategy
> being the random partitioner) for OPP? If so, what is the algorithm of the
> OPHF? Is it a type of minimal perfect hash function (MPHF)?
>
> I have read some papers on algorithms for MPHFs that preserve the order of
> hash values. However, in those algorithms the key space and the hash value
> space are the same size, and both are much smaller than the key space in
> our application (which may be userid or usertaskid). How can I deal with
> that, or am I on the wrong track?
>
> 2. My system is simple: I have some servers and I use Berkeley DB to store
> key/value pairs (our data model is simple). Is an OPP strategy useful when
> I don't have a data model like Cassandra's (column families, for example)?
>
> Thanks so much.
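The linked source answers question 1 directly: there is no separate hash function involved; under OPP the key itself serves as the token, so token order is key order. Schematically (a simplification of the real class, which wraps the key in a StringToken):

    // Simplified view of OrderPreservingPartitioner: the token for a key is
    // essentially the key itself, so comparing tokens compares keys and key
    // ranges stay contiguous on the ring.
    public class OrderPreservingSketch {
        public String getToken(String key) {
            return key;
        }
    }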
Re: Order preserving partitioning strategy
Use OPP and prefix keys with a randomizing element when range queries will not be required. For keys that will be queried in ranges, don't use such a prefix.

On Mon, Aug 23, 2010 at 8:36 PM, Hien. To Trong wrote:
> Hi,
> OrderPreservingPartitioner is efficient for range queries but can cause
> unevenly distributed data. Does anyone have an idea for a HybridPartitioner
> that takes advantage of both RandomPartitioner and OPP, or at least a
> partitioner that trades off between them?
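A sketch of that suggestion, assuming string keys; the prefix scheme (two hex characters derived from the key's own hash) is illustrative:

    // Hybrid keying under OPP: keys that never need range scans get a short
    // hash-derived prefix to scatter them around the ring; keys that do get
    // range-scanned are stored raw so they remain contiguous.
    public class HybridKeys {
        static String scatteredKey(String key) {
            // Deterministic: the same key always lands in the same place.
            return String.format("%02x:%s", key.hashCode() & 0xff, key);
        }

        static String rangeScannableKey(String key) {
            return key; // stored in key order; range queries work as expected
        }
    }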
Re: handling client network connections in Cassandra
Think he means on the server side, yo.

On Wed, Sep 1, 2010 at 12:31 PM, Jonathan Ellis wrote:
> You might want to build on top of something like Hector that handles
> the low-level pooling, failover, etc. already, instead of raw Thrift.
>
> On Wed, Sep 1, 2010 at 11:04 AM, Amol Deshpande wrote:
>> As I've mentioned before, I'm looking at implementing a protobuf
>> interface for clients to talk to Cassandra. Looking at the source, I
>> don't see a network thread/connection pool that I could easily piggyback
>> on. This is probably because both Thrift and Avro seem to have their own
>> internal connection management.
>>
>> Any opinions on Apache MINA?
>>
>> Thanks,
>> -amol
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
Re: dropped messages
Complete information, including everything in tpstats, is available to your monitoring systems via JMX. For production clusters, it is essential that you at least collect the JMX stats, if not alarm on various problems (such as backed-up stages).

b

On Wed, Sep 22, 2010 at 6:47 AM, Carl Bruecken wrote:
> On 9/22/10 9:37 AM, Jonathan Ellis wrote:
>>
>> it's easy to tell from tpstats which stage(s) are overloaded
>>
>> On Wed, Sep 22, 2010 at 8:29 AM, Carl Bruecken wrote:
>>>
>>> With the current implementation, it's impossible to tell from the logs
>>> which message types (verbs) were dropped. I read this was changed to
>>> avoid log spam, but I think the behavior should be configurable: either
>>> aggregate counts of dropped messages, or log individual occurrences with
>>> the message verb.
>>>
>>> One suggestion is to pass the message into
>>> MessagingService.incrementDroppedMessages and have a configuration item
>>> or system property select the behavior.
>
> It's generally transient/bursty. By the time I get around to checking
> tpstats, the active/pending counts are all back to 0 and I have no record
> of what occurred.
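A minimal collector along those lines, assuming Cassandra's default JMX port (8080 in the 0.6 era) and the PendingTasks attribute exposed by the stage thread pools under the org.apache.cassandra.concurrent domain; verify the exact names against your version with jconsole:

    import java.util.Set;
    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    // Poll every stage's pending-task count so bursts are recorded even if
    // they have drained by the time someone runs nodetool tpstats by hand.
    public class StagePoller {
        public static void main(String[] args) throws Exception {
            JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:8080/jmxrmi");
            JMXConnector jmxc = JMXConnectorFactory.connect(url);
            MBeanServerConnection mbs = jmxc.getMBeanServerConnection();

            // Each stage (ROW-MUTATION-STAGE, ROW-READ-STAGE, ...) is an MBean.
            Set<ObjectName> stages = mbs.queryNames(
                new ObjectName("org.apache.cassandra.concurrent:*"), null);
            for (ObjectName stage : stages) {
                Object pending = mbs.getAttribute(stage, "PendingTasks");
                System.out.println(stage.getKeyProperty("type")
                                   + " pending=" + pending);
            }
            jmxc.close();
        }
    }

Run on a cron or from your monitoring agent; feeding the numbers into whatever graphs/alerts you already have gives a record of transient backlog that the aggregate dropped-message log line can't.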