Re: how-to scan a table using CQL3

2013-05-10 Thread Thorsten von Eicken
Thanks, this is interesting, but if I'm not mistaken, Astyanax uses CQL2. I'm trying to find a CQL3 solution on top of the binary protocol. There has to be a way to do this in CQL3...? Thorsten On 5/10/2013 1:33 PM, Keith Wright wrote: What you are proposing should work and I started to impleme

how-to scan a table using CQL3

2013-05-10 Thread Thorsten von Eicken
What is the proper way to scan a table in CQL3 when using the random partitioner? Specifically, what is the proper way to *start* the scan? E.g., is it something like: select rowkey from my_table limit N; while some_row_is_returned do select rowkey from my_table where token(rowkey) > token(last_

Re: Looking for a good Ruby client

2012-08-01 Thread Thorsten von Eicken
Harry, we're in a similar situation and are starting to work out our own ruby client. The biggest issue is that it doesn't make much sense to build a higher level abstraction on anything other than CQL3, given where things are headed. At least this is our opinion. At the same time, CQL3 is just bar

Re: CQL3 and column slices

2012-07-31 Thread Thorsten von Eicken
In the absence of streaming, how does one retrieve a large result set in CQL3? E.g., to use the same example: > CREATE TABLE bug_test (a int, b int, c int, d int, e int, f text, PRIMARY > KEY (a, b, c, d, e) ); > with some data in it: > > SELECT * FROM bug_test; > > Results: > > a | b | c | d | e |

reading deleted rows is super-slow

2012-07-10 Thread Thorsten von Eicken
We're finding that reading deleted columns can be very slow and I'm trying to get confirmation for our analysis of what happens. We wrote lots of data eons ago into fairly large rows (up to 1MB). We recently read those rows and then deleted them. After this, we ran a verification-type pass that att

Re: Why so many SSTables?

2012-04-12 Thread Thorsten von Eicken
From my experience I would strongly advise against leveled compaction for your use-case. But you should certainly test and see for yourself! I have ~1TB on a node with ~13GB of heap. I ended up with 30k SSTables. I raised the SSTable size to 100MB but that didn't prove to be sufficient and I did i

Re: new node gets no data

2012-03-15 Thread Thorsten von Eicken
thelastpickle.com > > On 16/03/2012, at 6:45 AM, Thorsten von Eicken wrote: > >> I added a second node to a single-node ring. RF=1. I can't get the new >> node to receive any data. Logs look fine. Here's what nodetool reports: >> >> # nodetool -h loca

new node gets no data

2012-03-15 Thread Thorsten von Eicken
I added a second node to a single-node ring. RF=1. I can't get the new node to receive any data. Logs look fine. Here's what nodetool reports: # nodetool -h localhost ring Address DC Rack Status State Load Owns Token

Re: how to increase compaction rate?

2012-03-13 Thread Thorsten von Eicken
On 3/13/2012 4:13 PM, Viktor Jevdokimov wrote: > What we did to speedup this process to return all exhausted nodes into > normal state faster: > We have created a 6 temporary virtual single Cassandra nodes with 2 > CPU cores and 8GB RAM. > Stopped completely a compaction for CF on a production node

Re: how to increase compaction rate?

2012-03-13 Thread Thorsten von Eicken
On 3/12/2012 6:52 AM, Brandon Williams wrote: > On Mon, Mar 12, 2012 at 4:44 AM, aaron morton wrote: >> I don't understand why I >> don't get multiple concurrent compactions running, that's what would >> make the biggest performance difference. >> >> concurrent_compactors >> Controls how many conc

Re: how to increase compaction rate?

2012-03-11 Thread Thorsten von Eicken
On 3/11/2012 9:17 PM, Peter Schuller wrote: >> multithreaded_compaction: false > Set to true. I did try that. I didn't see it go any faster. The cpu load was lower, which I assumed meant fewer bytes/sec being compressed (SnappyCompressor). I didn't see multiple compactions in parallel. Nodetool com

how to increase compaction rate?

2012-03-11 Thread Thorsten von Eicken
I'm having difficulties with leveled compaction, it's not making fast enough progress. I'm on a quad-core box and it only does one compaction at a time. Cassandra version: 1.0.6. Here's nodetool compaction stats: # nodetool -h localhost compactionstats pending tasks: 2568 compaction type

recovering from network partition

2012-01-30 Thread Thorsten von Eicken
I'm trying to work through various failure modes to figure out the proper operating procedure and proper client coding practices. I'm a little unclear about what happens when a network partition gets repaired. Take the following scenario: - cluster with 5 nodes: A thru E; RF = 3; read_cf = 1; writ

Re: how to delete data with level compaction

2012-01-28 Thread Thorsten von Eicken
On 1/28/2012 9:34 AM, Peter Schuller wrote: >> I'm using level compaction and I have about 200GB compressed in my >> largest CFs. The disks are getting full. This is time-series data so I >> want to drop data that is a couple of months old. It's pretty easy for >> me to iterate through the relevant

how to delete data with level compaction

2012-01-27 Thread Thorsten von Eicken
I'm using level compaction and I have about 200GB compressed in my largest CFs. The disks are getting full. This is time-series data so I want to drop data that is a couple of months old. It's pretty easy for me to iterate through the relevant keys and delete the rows. But will that do anything? I

Re: ideal cluster size

2012-01-21 Thread Thorsten von Eicken
Good point. One thing I'm wondering about cassandra is what happens when there is a massive failure. For example, if 1/3 of the nodes go down or become unreachable. This could happen in EC2 if an AZ has a failure, or in a datacenter if a whole rack or UPS goes dark. I'm not so concerned about the t

Re: ideal cluster size

2012-01-21 Thread Thorsten von Eicken
s. This means that > you'll have same performance with less nodes, making > it far easier to manage. > > SSDs by themselves will give you an order of magnitude > improvement on I/O. > > > > On 1/19/2012 9:17 PM, Thorsten von Eicken wrote: > >

ideal cluster size

2012-01-19 Thread Thorsten von Eicken
We're embarking on a project where we estimate we will need on the order of 100 cassandra nodes. The data set is perfectly partitionable, meaning we have no queries that need to have access to all the data at once. We expect to run with RF=2 or =3. Is there some notion of ideal cluster size? Or per

Re: cassandra hit a wall: Too many open files (98567!)

2012-01-19 Thread Thorsten von Eicken
is built by you until open so many files? would you tell us? >> Thanks... >> >> >> On Sat, Jan 14, 2012 at 2:01 AM, Thorsten von Eicken >> mailto:t...@rightscale.com>> wrote: >> >> I'm running a single node cassandra 1.0.6 ser

cassandra hit a wall: Too many open files (98567!)

2012-01-13 Thread Thorsten von Eicken
I'm running a single node cassandra 1.0.6 server which hit a wall yesterday: ERROR [CompactionExecutor:2918] 2012-01-12 20:37:06,327 AbstractCassandraDaemon.java (line 133) Fatal exception in thread Thread[CompactionExecutor:2918,1,main] java.io.IOError: java.io.FileNotFoundException: /mnt/ebs/dat

read repair and column range queries

2011-11-30 Thread Thorsten von Eicken
Looking at the docs, I can't conclusively answer this question: Suppose I make this CQL query with consistency factor 1 and read-repair 100%: select 'a'..'z' from cf where key = 'xyz' limit 5; Suppose the node I connect to has the key and responds with (improvised syntax): ['a'->0, 'c'->2, 'e'->4,

trouble with deleted counter columns

2011-11-29 Thread Thorsten von Eicken
Running a single 1.0.3 node and using counter columns I have a problem. I have rows with ~200k counters. I deleted a number of such rows and now I can't put counters back in, or really, I can't query what I put back in. Example using the cli: [default@rslog_production] get req_word_freq['2024'

storage space and compaction speed

2011-11-19 Thread Thorsten von Eicken
I recently changed the default_validation_class on a bunch of CFs from BytesType to UTF8Type and I observed two things: first I saw a number of compactions during the migration that showed ~200% to ~400% of original in the log entry. Second, it seems that compaction speed has now halved. I'm using

Implementing locks using cassandra only

2010-09-13 Thread Thorsten von Eicken
I've been musing about how to implement locks using just cassandra, i.e. without sql db or zookeeper. I wrote up what I've come up with on the wiki (it's a bit too long for an email) at . I'm wondering whether I've overlooked something, especially I'm n