Re: Why so many vnodes?

2013-06-10 Thread Alain RODRIGUEZ
I think he actually meant *increase*, for this reason "For small T, a random choice of initial tokens will in most cases give a poor distribution of data. The larger T is, the closer to uniform the distribution will be, with increasing probability." Alain 2013/6/11 Theo Hultberg > thanks, tha

Re: Why so many vnodes?

2013-06-10 Thread Theo Hultberg
thanks, that makes sense, but I assume in your last sentence you mean decrease it for large clusters, not increase it? T# On Mon, Jun 10, 2013 at 11:02 PM, Richard Low wrote: > Hi Theo, > > The number (let's call it T and the number of nodes N) 256 was chosen to > give good load balancing for

Re: SSDs w/ C* only for commit log?

2013-06-10 Thread Manoj Mainali
You can refer to this conversation here http://comments.gmane.org/gmane.comp.db.cassandra.user/27366 Manoj On Tue, Jun 11, 2013 at 10:01 AM, Tanya Malik wrote: > If I understand the C* architecture correctly, in order to increase write > speed, I only need to put the commit log on SSDs. > > Whe

Coprocessors/Triggers in C*

2013-06-10 Thread Tanya Malik
Hi, Does C* support something like co-processor functionality/triggers to run client-supplied code in the address space of the server?

Re: Flushing column families individually in cassandra

2013-06-10 Thread Manoj Mainali
In the older versions it was possible, but in C* 1.2 it is a global configuration, so you won't be able to configure it on a per-CF basis. Manoj On Tue, Jun 11, 2013 at 10:32 AM, Tanya Malik wrote: > Is it possible in C* 1.2 to configure column families to be flushed > individually? > > So, if I hav

Flushing column families individually in cassandra

2013-06-10 Thread Tanya Malik
Is it possible in C* 1.2 to configure column families to be flushed individually? So, if I have 3 CFs, and one of them fills up to 64 MB while the others are only at 5 MB, will only the first, full CF get flushed to an SSTable? Also, is it possible to configure different sized memtables for differen

SSDs w/ C* only for commit log?

2013-06-10 Thread Tanya Malik
If I understand the C* architecture correctly, in order to increase write speed, I only need to put the commit log on SSDs. When the memtable gets flushed to the SStable file later on, that can go to traditional spinning disks, since that happens much later after the write successful has already b

Re: [Cassandra] Expanding a Cassandra cluster

2013-06-10 Thread Robert Coli
On Mon, Jun 10, 2013 at 3:13 PM, Emalayan Vairavanathan wrote: > I suspect that nodetool cleanup is IO intensive. So running nodetool cleanup > concurrently on the entire cluster may have a significant impact on the IO > performance of applications. cleanup is a specific kind of compaction, and as

Re: Cassandra (1.2.5) + Pig (0.11.1) Errors with large column families

2013-06-10 Thread Franc Carter
-- Forwarded message -- From: "Mark Lewandowski" Date: Jun 8, 2013 8:03 AM Subject: Cassandra (1.2.5) + Pig (0.11.1) Errors with large column families To: Cc: > I'm currently trying to get Cassandra (1.2.5) and Pig (0.11.1) to play nice together. I'm running a basic script: > >

Re: [Cassandra] Expanding a Cassandra cluster

2013-06-10 Thread Emalayan Vairavanathan
Thank you Edward. I suspect that nodetool cleanup is IO intensive. So running nodetool cleanup concurrently on the entire cluster may have a significant impact on the IO performance of applications. Apart from this, do you see any other implications on running the nodetool cleanup concurrently

Re: [Cassandra] Expanding a Cassandra cluster

2013-06-10 Thread Edward Capriolo
You eventually should run cleanup to remove data no longer needed on the node. However it does not need to be run quickly after a join. You can run it when you get around to it. I would run it on a few nodes at a time until they are all cleaned up. On Mon, Jun 10, 2013 at 5:00 PM, Emalayan Vairav
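The "few nodes at a time" approach suggested above could be scripted roughly as follows. This is only a sketch: the host addresses and the batch size of 2 are made-up placeholders, and it assumes `nodetool` is on the PATH and can reach each node with `-h`. By default it only builds the commands (dry run) and runs nothing.

```python
import subprocess


def batches(hosts, size):
    """Yield consecutive groups of at most `size` hosts."""
    for i in range(0, len(hosts), size):
        yield hosts[i:i + size]


def cleanup_in_batches(hosts, size=2, dry_run=True):
    """Plan (and optionally run) `nodetool cleanup` one batch at a time.

    Returns the planned commands grouped by batch. With dry_run=False,
    each batch is started in parallel and must finish before the next
    batch begins, so only `size` nodes do cleanup I/O at once.
    """
    planned = []
    for group in batches(hosts, size):
        cmds = [["nodetool", "-h", host, "cleanup"] for host in group]
        planned.append(cmds)
        if not dry_run:
            procs = [subprocess.Popen(cmd) for cmd in cmds]
            for proc in procs:
                proc.wait()
    return planned


# Hypothetical host list; dry run only prints what would be executed.
plan = cleanup_in_batches(["10.0.0.1", "10.0.0.2", "10.0.0.3"], size=2)
for group in plan:
    print([" ".join(cmd) for cmd in group])
```

A smaller batch size trades a longer total cleanup time for less concurrent I/O pressure on the cluster.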

Re: Why so many vnodes?

2013-06-10 Thread Richard Low
Hi Theo, The number (let's call it T and the number of nodes N) 256 was chosen to give good load balancing for random token assignments for most cluster sizes. For small T, a random choice of initial tokens will in most cases give a poor distribution of data. The larger T is, the closer to unifo
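The load-balancing claim above (random tokens give a poor distribution for small T and approach uniform as T grows) can be checked with a small simulation. This is a rough sketch, not Cassandra's actual token allocation: the ring is simplified to the integers 0 to 2^64, replication is ignored, and the function name, the cluster size of 6, and the seed are illustrative assumptions.

```python
import random


def ownership_spread(num_nodes, tokens_per_node, seed=0):
    """Return (min, max) fraction of the ring owned by any single node
    when every node picks `tokens_per_node` tokens uniformly at random."""
    rng = random.Random(seed)
    ring_size = 2 ** 64  # simplified ring, not the real partitioner range
    tokens = []
    for node in range(num_nodes):
        for _ in range(tokens_per_node):
            tokens.append((rng.randrange(ring_size), node))
    tokens.sort()

    owned = [0] * num_nodes
    # Each token owns the range from the previous token (wrapping around
    # the ring) up to itself.
    prev = tokens[-1][0] - ring_size
    for tok, node in tokens:
        owned[node] += tok - prev
        prev = tok

    fractions = [o / ring_size for o in owned]
    return min(fractions), max(fractions)


for t in (1, 16, 256):
    lo, hi = ownership_spread(num_nodes=6, tokens_per_node=t)
    print(f"T={t:3d}: min share {lo:.3f}, max share {hi:.3f}, ideal {1/6:.3f}")
```

With more tokens per node, the minimum and maximum shares should move closer to the ideal 1/N, which is the effect described in the message.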

[Cassandra] Expanding a Cassandra cluster

2013-06-10 Thread Emalayan Vairavanathan
Hi All, The DataStax manual suggests that during a Cassandra cluster expansion, an administrator has to run nodetool cleanup on each of the previously existing Cassandra nodes to remove the keys that no longer belong to those nodes. Further, the manual says that the nodetool cleanup task shou

Re: Data Loss/Missing With Cassandra

2013-06-10 Thread Nimi Wariboko Jr
Thanks for all the advice, I really appreciate it. 1.) Seeing as how I only had a single node cluster previously, if I just `nodetool move 0` the original node that should be an easy fix then? 2.) Is my data not marked as deleted? If I use sstableloader to restream the data am I just reorganizi

Re: Data Loss/Missing With Cassandra

2013-06-10 Thread Edward Capriolo
This is true unless you have run cleanup. If you have run 'nodetool cleanup' then there is data loss. On Mon, Jun 10, 2013 at 1:40 PM, Tyler Hobbs wrote: > > On Mon, Jun 10, 2013 at 11:39 AM, Nimi Wariboko Jr < > nimiwaribo...@gmail.com> wrote: > >> How can I recover that data? Can I assume the

Re: Why so many vnodes?

2013-06-10 Thread Theo Hultberg
I'm not sure I follow what you mean, or if I've misunderstood what Cassandra is telling me. Each node has 256 vnodes (or tokens, as the preferred name seems to be). When I run `nodetool status` each node is reported as having 256 vnodes, regardless of how many nodes are in the cluster. A single node

Re: Data Loss/Missing With Cassandra

2013-06-10 Thread Tyler Hobbs
On Mon, Jun 10, 2013 at 11:39 AM, Nimi Wariboko Jr wrote: > How can I recover that data? Can I assume they are still in the sstables? Yes > Would doing a sstable2json then reading and reinserting be an optimal > solution? You can just use the sstableloader utility to load them directly. It

Re: Data Loss/Missing With Cassandra

2013-06-10 Thread Edward Capriolo
To recover, you would dump everything and then re-insert everything. Another option would be to return all nodes to whatever tokens they had before the switches, since the old data is still there. Either way, both recovery options are long, painful, and involve a good number of manual steps. I would not w

RE: Unsubscribe?

2013-06-10 Thread Burns, Eric
I thought it was interesting that when I tried to subscribe to the developers list my e-mail was bounced for being spam. But the user subscription got through just fine. Could this be happening on the unsubscribe? I've seen "your message is spam" messages bounced by the recipient for looking

Re: Data Loss/Missing With Cassandra

2013-06-10 Thread Nimi Wariboko Jr
How can I recover that data? Can I assume they are still in the sstables? Would doing a sstable2json then reading and reinserting be an optimal solution? On Monday, June 10, 2013 at 9:18 AM, Tyler Hobbs wrote: > > On Sun, Jun 9, 2013 at 3:19 PM, Nimi Wariboko Jr (mailto:nimiwaribo...@gmail.co

Re: Unsubscribe?

2013-06-10 Thread Robert Wille
I unsubscribed a while ago and then resubscribed. It took about four unsubscribe attempts before it actually worked. From: "Fatih P." Reply-To: Date: Mon, 10 Jun 2013 17:46:30 +0300 To: Subject: Re: Unsubscribe? I tried the same and am still receiving mails. On Mon, Jun 10, 2013 at 5:34 PM, Luke

Re: Data Loss/Missing With Cassandra

2013-06-10 Thread Edward Capriolo
Although there is a ticket open that WILL start removing data automatically, which I keep saying is a bad idea. On Mon, Jun 10, 2013 at 12:18 PM, Tyler Hobbs wrote: > > On Sun, Jun 9, 2013 at 3:19 PM, Nimi Wariboko Jr > wrote: > >> If I had to do a repair after upping the RF, then that is pro

Eclipse 4.3 BIRT

2013-06-10 Thread Radim Kolar
Reading the changelog for Eclipse Kepler (4.3): BIRT has support for creating reports from Cassandra.

Re: Data Loss/Missing With Cassandra

2013-06-10 Thread Tyler Hobbs
On Sun, Jun 9, 2013 at 3:19 PM, Nimi Wariboko Jr wrote: > If I had to do a repair after upping the RF, then that is probably what > caused the data loss. Wish I had been more careful. > > I'm guessing the data is irrevocably lost, I didn't make any snapshots. > > Would it be possible to figure

Re: Unsubscribe?

2013-06-10 Thread Dave Brosius
You sent an email to user-unsubscr...@cassandra.apache.org from the email address used, and it didn't unsubscribe you? Did you get the 'are you sure' email? Did you check your spam folder? See http://cassandra.apache.org/ http://hadonejob.com/img/70907344.jpg On 06/10/2013 10:46 AM, Fati

Re: Why so many vnodes?

2013-06-10 Thread Milind Parikh
There are n vnodes regardless of the size of the physical cluster. Regards Milind On Jun 10, 2013 7:48 AM, "Theo Hultberg" wrote: > Hi, > > The default number of vnodes is 256, is there any significance in this > number? Since Cassandra's vnodes don't work like for example Riak's, where > there i

Why so many vnodes?

2013-06-10 Thread Theo Hultberg
Hi, The default number of vnodes is 256, is there any significance in this number? Since Cassandra's vnodes don't work like for example Riak's, where there is a fixed number of vnodes distributed evenly over the nodes, why so many? Even with a moderately sized cluster you get thousands of slices.

Re: Unsubscribe?

2013-06-10 Thread Fatih P.
I tried the same and am still receiving mails. On Mon, Jun 10, 2013 at 5:34 PM, Luke Hospadaruk wrote: > Hi, > I hate to be a clod, but I'd really like to unsubscribe from this list. > I've tried every permutation I can think of to do it "the right way", and > all of the styles in the help message. If

Unsubscribe?

2013-06-10 Thread Luke Hospadaruk
Hi, I hate to be a clod, but I'd really like to unsubscribe from this list. I've tried every permutation I can think of to do it "the right way", and all of the styles in the help message. If there's a moderator reading this could you please take me off the list? Thanks, Luke

Changing replication factor

2013-06-10 Thread Vegard Berget
Hi, If one increases the replication factor of a keyspace and then does a repair, how will this affect the performance of the affected nodes? Could we risk the nodes being (more or less) unresponsive while repair is going on? The nodes I am speaking of contain ~100 GB of data. Also, some of the key

Re: headed to cassandra conference next week in San Fran?

2013-06-10 Thread Hiller, Dean
Sounds good, feel free to text me when you are there. Later, Dean On 6/7/13 3:38 PM, "Faraaz Sareshwala" wrote: >I'll be attending and will try and meet up with you :). I see your posts >often >on this list -- would love to pick your brain and learn more about what >you are >using cassandra fo

Re: CQL3 Driver, DESCRIBE

2013-06-10 Thread Joe Greenawalt
Yep, *MetaData stuff is what I was looking for thanks for the pointer. Joe On Sun, Jun 9, 2013 at 8:00 PM, Sylvain Lebresne wrote: > DESCRIBE is not a CQL3 query but a cqlsh specific command, so it won't > work outside of cqlsh. However, at least if you are using the datastax java > driver (I d