Re: How safe is "nodetool move" in 1.2 ?

2014-04-16 Thread Richard Low
On 16 April 2014 05:08, Jonathan Lacefield wrote: > Assuming you have enough nodes not undergoing "move" to meet your CL > requirements, then yes, your cluster will still accept reads and writes. > However, it's always good to test this before doing it in production to > ensure your cluster and a

Re: Vnodes and replication

2014-04-08 Thread Richard Low
On 8 April 2014 09:29, vck wrote: > After reading through the vnodes and partitioning described in the > datastax documentation, I am still confused about how rows are > partitioned/replicated. > > With vnodes, I know that each Node on the ring now supports many token > ranges per Node. However

Re: Denial of Service Issue

2013-10-11 Thread Richard Low
On 11 October 2013 14:03, wrote: > I found the issue below concerning inactive client connections (see > *Cassandra > Security*). > We are using Cassandra 1.2.4 and the Cassandra JDBC driver as client. Is > this still an existing

Re: nodetool cfhistograms refresh

2013-10-01 Thread Richard Low
On 1 October 2013 16:21, Rene Kochen wrote: > Quick question. > > I am using Cassandra 1.0.11 > > When is nodetool cfhistograms output reset? I know that data is collected > during read requests. But I am wondering if it is data since the beginning > (start of Cassandra) or if it is reset periodi

Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-19 Thread Richard Low
On 19 September 2013 20:36, Suruchi Deodhar < suruchi.deod...@generalsentiment.com> wrote: > Thanks for your replies. I wiped out my data from the cluster and also > cleared the commitlog before restarting it with num_tokens=256. I then > uploaded data using sstableloader. > > However, I am still

Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-19 Thread Richard Low
The only thing you need to guarantee is that Cassandra doesn't start with num_tokens=1 (the default in 1.2.x) or, if it does, that you wipe all the data before starting it with higher num_tokens. On 19 September 2013 19:07, Robert Coli wrote: > On Thu, Sep 19, 2013 at 10:59 AM, Suruchi Deodhar

Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-19 Thread Richard Low
ay even after Cassandra is restarted. > > Thanks, > Suruchi > > On Sep 19, 2013, at 3:46, Richard Low wrote: > > On 19 September 2013 02:06, Jayadev Jayaraman wrote: > > We use vnodes with num_tokens = 256 ( 256 tokens per node ) . After >> loading some data with sstabl

Re: Row size in cfstats vs cfhistograms

2013-09-19 Thread Richard Low
On 19 September 2013 10:31, Rene Kochen wrote: I use Cassandra 1.0.11 > > If I do cfstats for a particular column family, I see a "Compacted row > maximum size" of 43388628 > > However, when I do a cfhistograms I do not see such a big row in the Row > Size column. The biggest row there is 126934.

Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-19 Thread Richard Low
On 19 September 2013 02:06, Jayadev Jayaraman wrote: We use vnodes with num_tokens = 256 ( 256 tokens per node ) . After loading > some data with sstableloader , we find that the cluster is heavily > imbalanced : > How did you select the tokens? Is this a brand new cluster which started on firs

Re: w00tw00t.at.ISC.SANS.DFind not found

2013-09-08 Thread Richard Low
On 8 September 2013 02:55, Tim Dunphy wrote: > Hey all, > > I'm seeing this exception in my cassandra logs: > > Exception during http request > mx4j.tools.adaptor.http.HttpException: file > mx4j/tools/adaptor/http/xsl/w00tw00t.at.ISC.SANS.DFind:) not found > at > mx4j.tools.adaptor.http.

Re: successful use of "shuffle"?

2013-09-02 Thread Richard Low
On 30 August 2013 18:42, Jeremiah D Jordan wrote: You need to introduce the new "vnode enabled" nodes in a new DC. Or you > will have similar issues to > https://issues.apache.org/jira/browse/CASSANDRA-5525 > > Add vnode DC: > > http://www.datastax.com/documentation/cassandra/1.2/webhelp/index.ht

Re: How many seed nodes should I use?

2013-08-29 Thread Richard Low
On 29 August 2013 01:55, Ike Walker wrote: > What is the best practice for how many seed nodes to have in a Cassandra > cluster? I remember reading a recommendation of 2 seeds per datacenter in > Datastax documentation for 0.7, but I'm interested to know what other > people are doing these days,

Re: token(), limit and wide rows

2013-08-17 Thread Richard Low
You can do it by using two types of query. One using token as you suggest, the other by fixing the partition key and walking through the other parts of the composite primary key. For example, consider the table: create table paging (a text, b text, c text, primary key (a, b)); I inserted ('1', '
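The two-query approach described above can be sketched in plain Python against an in-memory list of (a, b, c) tuples. This is only an illustration under stated assumptions: the helper names are hypothetical, the md5-based token() is a stand-in for the partitioner's hash rather than Cassandra's real token function, and the two query functions merely emulate what the corresponding CQL statements would return.

```python
import hashlib

def token(key):
    # Stand-in for the partitioner hash (assumption, not Cassandra's function).
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")

def query_by_token(rows, last_a, limit):
    # Emulates: SELECT * FROM paging WHERE token(a) > token(last_a) LIMIT limit
    ordered = sorted(rows, key=lambda r: (token(r[0]), r[1]))
    if last_a is not None:
        ordered = [r for r in ordered if token(r[0]) > token(last_a)]
    return ordered[:limit]

def query_within_partition(rows, a, last_b, limit):
    # Emulates: SELECT * FROM paging WHERE a = ? AND b > ? LIMIT limit
    return sorted(r for r in rows if r[0] == a and r[1] > last_b)[:limit]

def page_all(rows, limit):
    seen = []
    last_a = None
    while True:
        # Query type 1: move on to partitions after the last one seen.
        page = query_by_token(rows, last_a, limit)
        if not page:
            return seen
        seen.extend(page)
        last_a, last_b = page[-1][0], page[-1][1]
        # Query type 2: the page may have stopped part-way through the
        # last partition, so fix the partition key and walk the rest of it.
        while True:
            rest = query_within_partition(rows, last_a, last_b, limit)
            seen.extend(rest)
            if len(rest) < limit:
                break
            last_b = rest[-1][1]
```

Calling page_all(rows, 3) on a few partitions with several clustering values each should visit every row once, whatever the page size, because a partition is always finished with the second query before the token query skips past it.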

Re: Vnodes, adding a node ?

2013-08-14 Thread Richard Low
On 14 August 2013 20:02, Andrew Cobley wrote: > I have small test cluster of 2 nodes. I ran a stress test on it and with > nodetool status received the following: > > /usr/local/bin/apache-cassandra-2.0.0-rc1/log $ ../bin/nodetool status > Datacenter: datacenter1 > === > Sta

Re: cassandra 1.2.5- virtual nodes (num_token) pros/cons?

2013-08-13 Thread Richard Low
On 13 August 2013 10:15, Alain RODRIGUEZ wrote: Streaming from all the physical nodes in the cluster should make repair > faster, for the same reason it makes bootstrap faster. Shouldn't it ? > Virtual nodes don't speed up either very much. Repair and bootstrap will be limited by the node doi

Re: clarification of token() in CQL3

2013-08-06 Thread Richard Low
On 6 August 2013 16:56, Keith Freeman <8fo...@gmail.com> wrote: Your description makes me think that if new rows are added during the > paging (i.e. between one select with token()'s and another), they might > show up in the query results, right? (because the hash of the new row keys > might fal

Re: clarification of token() in CQL3

2013-08-06 Thread Richard Low
On 6 August 2013 15:12, Keith Freeman <8fo...@gmail.com> wrote: > I've seen in several places the advice to use queries like this to page > through lots of rows: > > > select id from mytable where token(id) > token(last_id) > > > But it's hard to find detailed information about how this works (at

Re: cassandra 1.2.5- virtual nodes (num_token) pros/cons?

2013-08-06 Thread Richard Low
On 6 August 2013 08:40, Aaron Morton wrote: > The reason for me looking at virtual nodes is because of terrible > experiences we had with 0.8 repairs and as per documentation (and logically) > the virtual nodes seems like it will help repairs being smoother. Is this > true? > > I've not thought to

Re: Counters and replication

2013-08-05 Thread Richard Low
On 5 August 2013 20:04, Christopher Wirt wrote: > Hello, > > Question about counters, replication and the ReplicateOnWriteStage > > I’ve recently turned on a new CF which uses a counter column. > > We have a three DC setup running Cassandra 1.2.4 with vN

Re: Reducing the number of vnodes

2013-08-05 Thread Richard Low
On 5 August 2013 12:30, Christopher Wirt wrote: I’m thinking about reducing the number of vnodes per server. > > We have 3 DC setup – one with 9 nodes, two with 3 nodes each. > > Each node has 256 vnodes. We’ve found that repair operations are beginning > to take too l

Re: nodetool cfstats write count ?

2013-07-29 Thread Richard Low
On 29 July 2013 14:43, Langston, Jim wrote: Running nodetool and looking at the cfstats output, for the > counters such as write count and read count, do those numbers > reflect any replication ? > > For instance, if write count shows 3000 and the replication factor > is 3, is that really 1000

Re: Installing Debian package from ASF repo

2013-07-29 Thread Richard Low
On 29 July 2013 12:00, Pavel Kirienko wrote: > Hi, > > I failed to install the Debian package of Cassandra 1.2.7 from ASF > repository because of 404 error. > APT said: > > http://www.apache.org/dist/cassandra/debian/pool/main/c/cassandra/cassandra_1.2.7_all.deb > 404 Not Found [IP: 192.87.106.

Re: Cassandra and RAIDs

2013-07-24 Thread Richard Low
On 24 July 2013 15:36, Jan Algermissen wrote: > is it recommended to set up Cassandra using 'RAID-ed' disks for per-node > reliability or do people usually just rely on having the multiple nodes > anyway - why bother with replicated disks? > It's not necessary, due to replication as you say. Y

Re: [deletion in the future]

2013-07-20 Thread Richard Low
On 20 July 2013 15:16, Alexis Rodríguez wrote: > That's exactly what is happening with my row, but not what I was trying to > do. It seems that I misunderstood the stackoverflow post. I was trying to > schedule a delete for an entire row, is using ttl for columns the only way? > Yes, there's no

Re: [deletion in the future]

2013-07-20 Thread Richard Low
On 19 July 2013 23:31, Alexis Rodríguez wrote: > Hi guys, > > I've read here [1] that you can make a deletion mutation "for" the future. > That mechanism operates as a schedule for deletions according to the > stackoverflow post. But, I've been having problems to make it work with > my thrift c++

Re: Cassandra with vnode and ByteOrderedPartition

2013-07-03 Thread Richard Low
On 3 July 2013 22:18, Sávio Teles wrote: > We were able to implement ByteOrderedPartition on Cassandra 1.1 and > insert an object in a specific machine. > > However, with Cassandra 1.2 and VNodes we can't implement VNode with > ByteOrderedPartitioner > to insert an object in a specific machine.

Re: Cassandra with vnode and ByteOrderedPartition

2013-07-03 Thread Richard Low
On 3 July 2013 21:04, Sávio Teles wrote: We're using ByteOrderedPartition to programmatically choose the machine > which an object will be inserted. > > How can I use ByteOrderedPartition with vnode on Cassandra 1.2? > Don't. Managing tokens with ByteOrderedPartitioner is very hard anyway, bu

Re: [Cassandra] Expanding a Cassandra cluster

2013-06-18 Thread Richard Low
On 10 June 2013 22:00, Emalayan Vairavanathan wrote: b) Will Cassandra automatically take care of removing > obsolete keys in future ? > In a future version Cassandra should automatically clean up for you: https://issues.apache.org/jira/browse/CASSANDRA-5051 Right now thoug

Re: Why so many vnodes?

2013-06-11 Thread Richard Low
On 11 June 2013 09:54, Theo Hultberg wrote: But in the paragraph just before Richard said that finding the node that > owns a token becomes slower on large clusters with lots of token ranges, so > increasing it further seems contradictory. > I do mean increase for larger clusters, but I guess it

Re: Why so many vnodes?

2013-06-10 Thread Richard Low
Hi Theo, The number (let's call it T and the number of nodes N) 256 was chosen to give good load balancing for random token assignments for most cluster sizes. For small T, a random choice of initial tokens will in most cases give a poor distribution of data. The larger T is, the closer to unifo
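The effect of T on balance can be explored with a small simulation. This is a rough sketch, not a model of Cassandra internals: it assumes a ring of 2**64 positions, draws tokens uniformly at random with a fixed seed, and reports how much of the ring the most loaded node owns relative to a perfectly even share. The function name and parameters are hypothetical.

```python
import random

def ownership_spread(num_nodes, tokens_per_node, seed=0):
    """Return (largest share of the ring owned by one node) / (even share)."""
    rng = random.Random(seed)
    ring_size = 2 ** 64  # assumed ring size, for illustration only
    tokens = []
    for node in range(num_nodes):
        for _ in range(tokens_per_node):
            tokens.append((rng.randrange(ring_size), node))
    tokens.sort()

    owned = [0] * num_nodes
    # Each token owns the range from the previous token up to itself;
    # the first token also owns the wrap-around range from the last token.
    prev = tokens[-1][0] - ring_size
    for tok, node in tokens:
        owned[node] += tok - prev
        prev = tok

    even_share = ring_size / num_nodes
    return max(owned) / even_share
```

Comparing, say, ownership_spread(20, 1) with ownership_spread(20, 256) gives a feel for why a small T with random tokens tends to leave one node with far more than its share, while a larger T pulls the ratio towards 1.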

Re: cassandra-shuffle time to completion and required disk space

2013-05-01 Thread Richard Low
Hi John, > - Each machine needed enough free diskspace to potentially hold the entire cluster's sstables on disk I wrote a possible explanation for why Cassandra is trying to use too much space on your ticket: https://issues.apache.org/jira/browse/CASSANDRA-5525 if you could provide the informa

Re: Problems with shuffle

2013-04-15 Thread Richard Low
On 14 April 2013 00:56, Rustam Aliyev wrote: > Just a followup on this issue. Due to the cost of shuffle, we decided > not to do it. Recently, we added new node and ended up in not well balanced > cluster: > > Datacenter: datacenter1 > === > Status=Up/Down > |/ State=Normal/L

Re: Virtual Nodes, lots of physical nodes and potentially increasing outage count?

2012-12-12 Thread Richard Low
ot be > comfortable with a 144+ disk RAID 6 array, no matter the rebuild speed :) > > Is it possible to configure or write a snitch that would create separate > distribution zones within the cluster? (e.g. 144 nodes in cluster, split > into 12 zones. Data stored to node 1 could

Re: Why Secondary indexes is so slowly by my test?

2012-12-11 Thread Richard Low
and 12s. > When 'X' up to 13,the time:2.3s and 33s. > When 'X' up to 25,the time:3.8s and 53s. > > According to this, when 'X' up to billion, what's the result? Can Secondary > indexes be used in production? I hope it's my mistake in doing this test. Can > anyone give some tips about it? > Thanks in advance. > fancy > -- Richard Low Acunu | http://www.acunu.com | @acunu

Re: Vnode migration path

2012-12-11 Thread Richard Low
t, > mike > > 'Like' us on Facebook for exclusive content and other resources on all > Barracuda Networks solutions. > > Visit http://barracudanetworks.com/facebook > > > > > -- Richard Low Acunu | http://www.acunu.com | @acunu

Re: Virtual Nodes, lots of physical nodes and potentially increasing outage count?

2012-12-11 Thread Richard Low
This is because for a given range of tokens quorum >> will be impossible, but quorum will be possible for others. >> >> In a vnode world if any two nodes are down, then the intersection of >> vnode token ranges they have is unavailable. >> >> I think it

Re: Virtual Nodes, lots of physical nodes and potentially increasing outage count?

2012-12-10 Thread Richard Low
>>> say into the triple digits) that the probability of at least one failure to >>> Quorum read/write occurring in a given time period would *increase*. >>> >>> Would this hold true, at least until physical nodes becomes greater than >>> num_tokens pe

Re: Hector counter question

2012-03-20 Thread Richard Low
afe.  But reading and incrementing is unsafe. Richard. -- Richard Low Acunu | http://www.acunu.com | @acunu

Re: Doubts related to composite type column names/values

2011-12-20 Thread Richard Low
ecify the type for each column, so they can be different. There is extra storage overhead for this and care must be taken to ensure all column names remain comparable. -- Richard Low Acunu | http://www.acunu.com | @acunu

Re: GC for ParNew on 0.8.6

2011-10-07 Thread Richard Low
upgraded and not run into this? What do you mean by the cluster has been weird since the upgrade? Have you noticed slow-downs? Any other messages in the logs that have appeared since the upgrade? Richard. -- Richard Low Acunu | http://www.acunu.com | @acunu

Re: Cassandra 0.8 Counters Inverted Index?

2011-10-03 Thread Richard Low
or slightly out of date. Richard. -- Richard Low Acunu | http://www.acunu.com | @acunu

Re: 15 seconds to increment 17k keys?

2011-09-02 Thread Richard Low
ption of reads with low consistency levels and read_repair_chance < 1.0). Note also that there is just one read per counter increment, not a read per replica. -- Richard Low Acunu | http://www.acunu.com | @acunu

Re: 15 seconds to increment 17k keys?

2011-09-01 Thread Richard Low
per second is about right, although you can probably do some tuning to improve this. I've also found that the pycassa client uses significant amounts of CPU, so be careful you are not CPU bound on the client. -- Richard Low Acunu | http://www.acunu.com | @acunu On Thu, Sep 1, 2011 at 2:

Re: hw requirements

2011-08-29 Thread Richard Low
with 1-8 TB of storage, but there are cases where bigger or smaller makes sense. Don't overspec your nodes - you'll be better off with more smaller nodes. You can use SSDs if you need the random read rate, and SATA drives are fine too. -- Richard Low Acunu | http://www.acunu.com | @acu

Pre-CassandraSF Happy Hour on Sunday

2011-07-08 Thread Richard Low
assandrasf-happyhour.eventbrite.com/ Hope you can join us! -- Richard Low Acunu | http://www.acunu.com | @acunu

Re: deduct token values for BOP

2011-07-07 Thread Richard Low
On Thu, Jul 7, 2011 at 3:39 PM, A J wrote: > Thanks. The above works. > But when I try to use the binary values rather than the hex values, it > does not work. i.e. instead of using 64ff, I use 01100100. Instead of > 6Dff, I use 01101101. > When using the binary values, everything (strings startin

Re: deduct token values for BOP

2011-07-06 Thread Richard Low
ery string that starts with a-d with only characters a-z afterwards will go to N1. Richard. -- Richard Low Acunu | http://www.acunu.com | @acunu

Re: issue with querying SuperColumn

2011-06-21 Thread Richard Low
You have key validation class UTF8Type for the standard CF, but BytesType for the super. This is why the key is "1" for standard, but printed as "31" for super, which is the hex ascii code for 1. In your java code, use "1".getBytes() as your key and it should wo
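The relationship between the key "1" and the printed value "31" is easy to check outside Cassandra. The short Python sketch below (helper names are hypothetical) only shows the byte-level encoding; it does not touch any Cassandra or client API.

```python
def key_as_hex(key):
    # Encode the text key to bytes, then show those bytes as hex digits,
    # which is how a BytesType key is displayed.
    return key.encode("utf-8").hex()

def hex_as_key(hex_string):
    # Reverse direction: hex digits back to the original text key.
    return bytes.fromhex(hex_string).decode("utf-8")
```

So a key that reads as "1" under UTF8Type and as "31" under BytesType is the same single byte, 0x31, displayed two different ways.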

Re: problem in using get_range() function

2011-06-13 Thread Richard Low
If you set key_start and key_finish to empty strings then you will get all rows. You can do this with either RP or OPP, although with RP they will not be returned in order. -- Richard Low Acunu | http://www.acunu.com | @acunu On Mon, Jun 13, 2011 at 2:31 PM, Amrita Jayakumar wrote: > ca

Re: Retrieving a column from a fat row vs retrieving a single row

2011-06-09 Thread Richard Low
2011/6/9 Héctor Izquierdo Seliva : > Yeah, but if I have RF=3 then there are three nodes that can answer the > request right? Yes, if you're happy to read ConsistencyLevel.ONE.

Re: Retrieving a column from a fat row vs retrieving a single row

2011-06-09 Thread Richard Low
Remember also that partitioning is done by rows, not columns. So large rows are stored on a single host. This means they can't be load balanced and also all requests to that row will hit one host. Having separate rows will allow load balancing of I/Os. -- Richard Low Acunu |
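A toy placement function makes the point concrete. This is a simplified sketch under loud assumptions: the node names are made up, and placement is done with md5 modulo the node count rather than a real token ring with replication. What it does share with the description above is that only the row key decides where data lives.

```python
import hashlib

NODES = ["node1", "node2", "node3", "node4"]  # hypothetical node names

def node_for_row(row_key):
    # Simplified placement: hash the row key only and pick a node by modulo.
    digest = hashlib.md5(row_key.encode("utf-8")).digest()
    return NODES[int.from_bytes(digest, "big") % len(NODES)]

def nodes_for_fat_row(row_key, column_names):
    # Column names play no part in placement, so every column of the row
    # lands on the node chosen for the row key.
    return {node_for_row(row_key) for _ in column_names}

def nodes_for_separate_rows(row_keys):
    # Many distinct row keys hash to different nodes, spreading the I/O.
    return {node_for_row(key) for key in row_keys}
```

One fat row with a thousand columns maps to a single node here, while a thousand separate row keys are spread over several nodes, which is the load-balancing difference in question.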

Re: Misc Performance Questions

2011-06-08 Thread Richard Low
more seeks to seek over unwanted data. It will also help buffer caching to separate them - the small SSTables are more likely to remain in cache. -- Richard Low Acunu | http://www.acunu.com | @acunu

Re: Misc Performance Questions

2011-06-08 Thread Richard Low
be slower. > Maybe perf related:  Will there be a problem having multiple keyspaces on a > cluster all with different replication factors, from 1-3? No. Richard. -- Richard Low Acunu | http://www.acunu.com | @acunu