Would warnings about overlapping SSTables explain high pending compactions?

2014-09-24 Thread Donald Smith
On one of our nodes we have lots of pending compactions (499). In the past we've seen pending compactions go up to 2400 and all the way back down again. Investigating, I saw warnings such as the following in the logs about overlapping SSTables and about needing to run "nodetool scrub" on a ta

Re: node keeps dying

2014-09-24 Thread Prem Yadav
BTW, thanks Michael. I am surprised I didn't search for Cassandra OOM before. I got some good links that discuss that. Will try to optimize and see how it goes. On Wed, Sep 24, 2014 at 10:27 PM, Prem Yadav wrote: > Well, it's not the Linux OOM killer. The system is running with all default >

Re: node keeps dying

2014-09-24 Thread Prem Yadav
Well, it's not the Linux OOM killer. The system is running with all default settings. Total memory is 7 GB, of which Cassandra is assigned 2 GB, with 2-core processors. Two rings with 3 nodes in each ring. On Wed, Sep 24, 2014 at 9:53 PM, Michael Shuler wrote: > On 09/24/2014 11:32 AM, Prem Yadav wrote: > >> this

Re: node keeps dying

2014-09-24 Thread Michael Shuler
On 09/24/2014 11:32 AM, Prem Yadav wrote: > this is an issue that has happened a few times. We are using DSE 4.0 I believe this is Apache Cassandra 2.0.5, which is better info for this list. > One of the Cassandra nodes is detected as dead by the opscenter even though I can see the process is u

Re: Adjusting readahead for SSD disk seeks

2014-09-24 Thread Daniel Chia
Cassandra only reads a small part of each SSTable during normal operation (not compaction). In fact, DataStax recommends lowering readahead: http://www.datastax.com/documentation/cassandra/2.1/cassandra/install/installRecommendSettings.html There are also blog posts where people have improved their

Re: Adjusting readahead for SSD disk seeks

2014-09-24 Thread DuyHai Doan
"does it typically have to read in the entire SSTable into memory (assuming the bloom filter said yes)?" --> No, that would be a performance killer. On the read path, after the Bloom filter, Cassandra uses the "Partition Key Cache" to see if the partition it is looking for is present there. If yes, it get

Adjusting readahead for SSD disk seeks

2014-09-24 Thread Donald Smith
We're using Cassandra as a key-value store; our values are small. So we're thinking we don't need much disk readahead (e.g., "blockdev --getra /dev/sda"). We're using SSDs. When Cassandra does disk seeks to satisfy read requests, does it typically have to read in the entire SSTable into memory
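For reference when comparing readahead settings, here is a minimal sketch of the unit conversion involved. It relies on the fact that the `--getra` option of `blockdev` reports readahead in 512-byte sectors; the function names and the example values are illustrative assumptions, not recommendations from this thread.

```python
SECTOR_BYTES = 512  # blockdev --getra / --setra count 512-byte sectors


def readahead_kib(sectors):
    """Convert a readahead value in 512-byte sectors to KiB."""
    return sectors * SECTOR_BYTES / 1024


def sectors_for_kib(kib):
    """Convert a desired readahead in KiB to a 512-byte sector count."""
    return kib * 1024 // SECTOR_BYTES


# Example values only: 256 sectors is a commonly seen setting.
current_kib = readahead_kib(256)
small_setting = sectors_for_kib(8)
```

With these example inputs, a reported value of 256 sectors corresponds to 128 KiB per read, and an 8 KiB target corresponds to 16 sectors.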

Node Joining, Not Streaming

2014-09-24 Thread Gene Robichaux
I just added two nodes, one in DC-A and one in DC-B. The node in DC-A started and immediately began to stream files from its peers. The node in DC-B has been in the JOINING state for nearly 24 hours and I have not seen any streams started. This cluster is in-house spanning 2 DCs. Using DataSt

Re: How to get data which has changed within x minutes using CQL?

2014-09-24 Thread Tobias Widén
You could have two tables with the data inserted with TTL=5 minutes in one and TTL=15 minutes in the other. Just select everything from the appropriate table. Having separate tables for different TTLs is also a good practice for keeping your SSTables in good condition. From: DuyHai Doan mailto:
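The two-table idea can be sketched with a toy in-memory model. This is not driver code: `TtlTable`, `write`, and the table names are hypothetical, and the CQL in the comments only approximates what the real statements might look like. Timestamps are plain seconds passed in by hand so the behaviour is easy to follow.

```python
class TtlTable:
    """Toy stand-in for a table whose rows are all written with one TTL."""

    def __init__(self, ttl_seconds):
        self.ttl_seconds = ttl_seconds
        self._rows = {}  # key -> (value, written_at)

    def insert(self, key, value, now):
        # Roughly: INSERT INTO <table> (key, value) VALUES (?, ?) USING TTL <ttl_seconds>
        self._rows[key] = (value, now)

    def select_all(self, now):
        # Roughly: SELECT * FROM <table>; rows past their TTL are not returned.
        return {
            key: value
            for key, (value, written_at) in self._rows.items()
            if now - written_at < self.ttl_seconds
        }


def write(tables, key, value, now):
    """Write every change to each TTL table."""
    for table in tables:
        table.insert(key, value, now)


changed_5m = TtlTable(ttl_seconds=5 * 60)
changed_15m = TtlTable(ttl_seconds=15 * 60)
tables = [changed_5m, changed_15m]

write(tables, "a", 1, now=0)    # written at t = 0 s
write(tables, "b", 2, now=600)  # written at t = 600 s

recent_5 = changed_5m.select_all(now=700)    # only "b" is younger than 5 minutes
recent_15 = changed_15m.select_all(now=700)  # both are younger than 15 minutes
```

The point of the design is that the query itself needs no time predicate: whatever is still present in a table changed within that table's TTL window.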

node keeps dying

2014-09-24 Thread Prem Yadav
Hi, this is an issue that has happened a few times. We are using DSE 4.0. One of the Cassandra nodes is detected as dead by OpsCenter even though I can see the process is up. The logs show a heap space error: INFO [RMI TCP Connection(18270)-172.31.49.189] 2014-09-24 08:31:05,340 StorageService.

Re: using dynamic cell names in CQL 3

2014-09-24 Thread DuyHai Doan
A dynamic column in Thrift ≈ a clustering column in CQL. Can you give more details about your data model? On Wed, Sep 24, 2014 at 1:11 PM, shahab wrote: > Hi, > > I would like to define schema for a table where the column (cell) names > are defined dynamically. Apparently there is a way to do this
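As a rough illustration of that correspondence, the sketch below maps one Thrift-style wide row (a dictionary of dynamically named cells) onto CQL-style rows in which the cell name becomes a clustering column. The table name, column names, and helper function are all hypothetical and chosen only for this example; the schema string is never executed.

```python
# Hypothetical schema for illustration; none of these names come from the thread.
SCHEMA = """
CREATE TABLE dynamic_cells (
    row_key    text,
    cell_name  text,
    cell_value text,
    PRIMARY KEY (row_key, cell_name)
);
"""


def wide_row_to_cql_rows(row_key, cells):
    """Turn one wide row {cell_name: value} into (row_key, cell_name, value)
    tuples, ordered by cell_name as a clustering column would order them."""
    return [(row_key, name, cells[name]) for name in sorted(cells)]


rows = wide_row_to_cql_rows("user42", {"temp": "21", "humidity": "40"})
```

Each dynamically named cell ends up as its own row within the `row_key` partition, so new cell names can be added at write time without altering the schema.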

using dynamic cell names in CQL 3

2014-09-24 Thread shahab
Hi, I would like to define a schema for a table where the column (cell) names are defined dynamically. Apparently there is a way to do this in Thrift (http://www.datastax.com/dev/blog/does-cql-support-dynamic-columns-wide-rows), but I couldn't find how to do the same using CQL. Any resource/ex

Re: CPU consumption of Cassandra

2014-09-24 Thread DuyHai Doan
More details on the "composite partition key" and usage of "IN". I'm not sure whether your stress tests are using the IN clause with the partition key, but this post is worth reading to avoid some caveats if you're planning to use composite partition keys in the future: http://getprismatic.com/story/

RE: CPU consumption of Cassandra

2014-09-24 Thread Leleu Eric
Thanks for all your comments, you have helped me a lot. My composite partition key is clearly the bottleneck (PRIMARY KEY ((owner, tenantid), name)). When I ran cassandra-stress (on a dedicated server), the number of reads couldn't go beyond 10K/s (with CPU at 20% idle, 72% user, and 8% sys). If