Re: cassandra-shuffle time to completion and required disk space

2013-05-01 Thread Richard Low
Hi John,
> - Each machine needed enough free diskspace to potentially hold the entire cluster's sstables on disk
I wrote a possible explanation for why Cassandra is trying to use too much space on your ticket: https://issues.apache.org/jira/browse/CASSANDRA-5525 if you could provide the informa

Re: cassandra-shuffle time to completion and required disk space

2013-04-30 Thread aaron morton
> These are taken just before starting shuffle (ran repair/cleanup the day before).
> During shuffle disabled all reads/writes to the cluster.
>
> nodetool status keyspace:
>
> Load      Tokens  Owns (effective)  Host ID
> 80.95 GB  256     16.7%             754f9f4c-4ba7-4495-97e7-1f5b6755c

Re: cassandra-shuffle time to completion and required disk space

2013-04-29 Thread John Watson
That's what we tried first, before the shuffle, and we ran into the space issue. That's detailed in another thread titled: "Adding nodes in 1.2 with vnodes requires huge disks"
On Mon, Apr 29, 2013 at 4:08 AM, Sam Overton wrote:
> An alternative to running shuffle is to do a rolling bootstrap/dec

Re: cassandra-shuffle time to completion and required disk space

2013-04-29 Thread Sam Overton
An alternative to running shuffle is to do a rolling bootstrap/decommission. You would set num_tokens on the existing hosts (and restart them) so that they split their ranges, then bootstrap in N new hosts, then decommission the old ones.
On 28 April 2013 22:21, John Watson wrote:
> The amount
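The rolling bootstrap/decommission sequence described in this message can be sketched as an ordered plan. This is only an illustrative outline, not a tested procedure: the function name and host names are hypothetical, and the default of 256 tokens is taken from the cluster description elsewhere in the thread. The script prints the steps; it does not run anything against a cluster.

```python
def rolling_migration_plan(old_hosts, new_hosts, num_tokens=256):
    """Return the ordered steps of a rolling bootstrap/decommission.

    Hypothetical helper for illustration only; it builds a checklist
    and never contacts a Cassandra node.
    """
    steps = []
    # Step 1: existing hosts split their ranges once num_tokens is set
    # and the node is restarted.
    for host in old_hosts:
        steps.append(f"{host}: set num_tokens: {num_tokens} in cassandra.yaml, then restart")
    # Step 2: new hosts join the ring with vnodes enabled.
    for host in new_hosts:
        steps.append(f"{host}: start with num_tokens: {num_tokens} and wait for bootstrap to finish")
    # Step 3: old hosts leave the ring one at a time.
    for host in old_hosts:
        steps.append(f"{host}: run 'nodetool decommission' and wait for it to leave the ring")
    return steps


if __name__ == "__main__":
    # Host names below are placeholders, not taken from the thread.
    plan = rolling_migration_plan(["old1", "old2"], ["new1", "new2"])
    for number, step in enumerate(plan, start=1):
        print(f"{number}. {step}")
```

Keeping the three phases strictly in this order matters: the new hosts should only be bootstrapped after the existing hosts have split their ranges, and the old hosts should only be decommissioned once the new ones have finished joining.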

Re: cassandra-shuffle time to completion and required disk space

2013-04-28 Thread John Watson
11 nodes
1 keyspace
256 vnodes per node
upgraded 1.1.9 to 1.2.3 a week ago
These are taken just before starting shuffle (ran repair/cleanup the day before).
During shuffle disabled all reads/writes to the cluster.
nodetool status keyspace:
Load      Tokens  Owns (effective)  Host ID
80.95 GB

Re: cassandra-shuffle time to completion and required disk space

2013-04-28 Thread aaron morton
Can you provide some info on the number of nodes, node load, cluster load, etc.? AFAIK shuffle was not an easy thing to test and does not get much real-world use, as only some people will run it and they (normally) use it once. Any info you can provide may help improve the process. Cheers -

cassandra-shuffle time to completion and required disk space

2013-04-28 Thread John Watson
The amount of time/space cassandra-shuffle requires when upgrading to using vnodes should really be apparent in the documentation (once some is written). The only semi-noticeable remark about the exorbitant amount of time is a bullet point in: http://wiki.apache.org/cassandra/VirtualNodes/Balance "Shuffling