Re: Cassandra read throughput with little/no caching.

2013-01-03 Thread Tyler Hobbs
> > Your description above was much better :-) I'm more interested in docs for > the raw metrics provided in JMX. I don't think there are any good docs for what is exposed directly through JMX. Most of the OpsCenter metrics map closely to one exposed JMX item, so that's a start. Other than that

Re: Cassandra read throughput with little/no caching.

2013-01-02 Thread James Masson
On 02/01/13 16:18, Tyler Hobbs wrote: On Wed, Jan 2, 2013 at 5:28 AM, James Masson <james.mas...@opigram.com> wrote: > 1) Hector sends a request to some node in the cluster, which will act as the coordinator. 2) The coordinator then sends the actual read requests out to each of the (RF

Re: Cassandra read throughput with little/no caching.

2013-01-02 Thread Tyler Hobbs
On Wed, Jan 2, 2013 at 5:28 AM, James Masson wrote:
> thanks for clarifying this. So you're saying the difference between the
> global Read Request latency in opscenter, and the column family specific
> one is in the effort coordinating a validated read across multiple replicas?
Yes.
> Is th

Re: Cassandra read throughput with little/no caching.

2013-01-02 Thread James Masson
On 31/12/12 18:45, Tyler Hobbs wrote: On Mon, Dec 31, 2012 at 11:24 AM, James Masson <james.mas...@opigram.com> wrote: Well, it turns out the Read-Request Latency graph in Ops-Center is highly misleading. Using jconsole, the read-latency for the column family in question

Re: Cassandra read throughput with little/no caching.

2012-12-31 Thread Tyler Hobbs
On Mon, Dec 31, 2012 at 11:24 AM, James Masson wrote:
> Well, it turns out the Read-Request Latency graph in Ops-Center is highly
> misleading.
>
> Using jconsole, the read-latency for the column family in question is
> actually normally around 800 microseconds, punctuated by occasional big
> sp

Re: Cassandra read throughput with little/no caching.

2012-12-31 Thread Keith Wright
Following up on this, I was hoping to get everyone's take on my use case for Cassandra and see if everyone agrees it can meet the requirements: I have a very tight SLA around get times. These are almost always single row fetches for 20-50 columns on a row that is likely under 200 columns. The req

Re: Cassandra read throughput with little/no caching.

2012-12-31 Thread James Masson
Well, it turns out the Read-Request Latency graph in Ops-Center is highly misleading. Using jconsole, the read-latency for the column family in question is actually normally around 800 microseconds, punctuated by occasional big spikes that drive up the averages. Towards the end of the batc
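To illustrate how a handful of spikes can pull an average far above the typical read latency, here is a minimal sketch. The sample values are made up for illustration (they are not measurements from this thread), and the names `samples_us`, `mean_us` and `median_us` are hypothetical.

```python
import statistics

# Hypothetical latency samples in microseconds (not data from this thread):
# 97 reads at roughly the "normal" 800 us, plus 3 large spikes.
samples_us = [800] * 97 + [100_000] * 3

# The mean is dragged up by the three spikes.
mean_us = statistics.mean(samples_us)

# The median still reflects what a typical read looks like.
median_us = statistics.median(samples_us)

print(f"mean:   {mean_us} us")
print(f"median: {median_us} us")
```

With these made-up numbers, three slow reads out of a hundred are enough to make the average several times larger than the median, so a graph of averages alone can hide the fact that most reads are fast.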

Re: Cassandra read throughput with little/no caching.

2012-12-31 Thread James Masson
Hi Yiming, I've had the chance to observe what happens to cassandra read response time over time. It starts out with fast 1ms reads, until the first compaction starts, then the CPUs are maxed out for a period, and read latency rises to 4ms. After compaction finishes, the system returns to 1

Re: Cassandra read throughput with little/no caching.

2012-12-28 Thread Yiming Sun
James, sorry I was out for a few days. Yes, if the row cache doesn't give a good hit rate then it should be disabled. Is there any chance to increase the VM configuration specs? I couldn't pinpoint in exactly which message you mentioned the VMs are 2GB mem and 2 cores, which is a bit meager. Al

Re: Cassandra read throughput with little/no caching.

2012-12-24 Thread James Masson
On 21/12/12 17:56, Yiming Sun wrote: James, you could experiment with Row cache, with off-heap JNA cache, and see if it helps. My own experience with row cache was not good, and the OS cache seemed to be most useful, but in my case, our data space was big, over 10TB. Your sequential access pa

Re: Cassandra read throughput with little/no caching.

2012-12-24 Thread James Masson
Hi Aaron, On 23/12/12 20:18, aaron morton wrote: First, the non helpful advice, I strongly suggest changing the data model so you do not have 100MB+ rows. They will make life harder. I don't think we have 100MB+ rows. Column families, yes - but not rows. Write request latency is about 900

Re: Cassandra read throughput with little/no caching.

2012-12-23 Thread aaron morton
First, the non helpful advice, I strongly suggest changing the data model so you do not have 100MB+ rows. They will make life harder. > Write request latency is about 900 microsecs, read request > latency > is about 4000 microsecs. > > 4 milliseconds to

Re: Cassandra read throughput with little/no caching.

2012-12-21 Thread Yiming Sun
James, you could experiment with Row cache, with off-heap JNA cache, and see if it helps. My own experience with row cache was not good, and the OS cache seemed to be most useful, but in my case, our data space was big, over 10TB. Your sequential access pattern certainly doesn't play well with LR

Re: Cassandra read throughput with little/no caching.

2012-12-21 Thread James Masson
On 21/12/12 16:27, Yiming Sun wrote: James, using RandomPartitioner, the order of the rows is random, so when you request these rows in "Sequential" order (sort by the date?), Cassandra is not reading them sequentially. Yes, I understand the "next" row to be retrieved in sequence is likely t

Re: Cassandra read throughput with little/no caching.

2012-12-21 Thread Yiming Sun
James, using RandomPartitioner, the order of the rows is random, so when you request these rows in "Sequential" order (sort by the date?), Cassandra is not reading them sequentially. The size of the data, 200Mb, 300Mb, and 40Mb, are these the size for each column? Or are these the total size of t
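A rough sketch of why hashed placement scrambles the order: RandomPartitioner derives each row's token from an MD5 hash of the row key, so keys that sort neatly by date end up scattered across the token range. The code below is a simplification under stated assumptions. The row keys are hypothetical date strings, and `md5_token` is an illustrative helper that reads the digest as an unsigned integer, which is not necessarily how Cassandra computes the token in detail.

```python
import hashlib


def md5_token(row_key: bytes) -> int:
    """Illustrative token: the MD5 digest of the key read as an unsigned integer.

    This is a simplification, not Cassandra's exact token computation.
    """
    return int.from_bytes(hashlib.md5(row_key).digest(), "big")


# Hypothetical row keys that are sequential from the application's point of view.
keys = [f"2012-12-{day:02d}".encode() for day in range(1, 11)]

# Order in which the same rows would appear when sorted by token.
by_token = sorted(keys, key=md5_token)

for key in by_token:
    print(key.decode(), md5_token(key))
```

Reading the keys in date order therefore means jumping around in token order, which is the sense in which the reads are not sequential on disk or across nodes.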

Re: Cassandra read throughput with little/no caching.

2012-12-21 Thread James Masson
Hi, thanks for the reply On 21/12/12 14:36, Yiming Sun wrote: I have a few questions for you, James, 1. how many nodes are in your Cassandra ring? 2 or 3 - depending on environment - it doesn't seem to make a difference to throughput very much. What is a 30 minute task on a 2 node environ

Re: Cassandra read throughput with little/no caching.

2012-12-21 Thread Yiming Sun
I have a few questions for you, James, 1. how many nodes are in your Cassandra ring? 2. what is the replication factor? 3. when you say sequentially, what do you mean? what Partitioner do you use? 4. how many columns per row? how much data per row? per column? 5. what client library do you use

Cassandra read throughput with little/no caching.

2012-12-21 Thread James Masson
Hi list-users, We have an application that has a relatively unusual access pattern in cassandra 1.1.6. Essentially we read an entire multi-hundred-megabyte column family sequentially (little chance of a cassandra cache hit), perform some operations on the data, and write the data back to ano