Re: sstables remain after compaction

2015-03-03 Thread Jason Wee
noted Tyler... and many thanks.. well, I read the Cassandra JIRA issues and just followed one of the comments: https://issues.apache.org/jira/browse/CASSANDRA-5740 In general, I thought we always advised to upgrade through the 'major' revs, 1.0 -> 1.1 -> 1.2. Or, at least, I think that's the advice now

Re: Composite Keys in cassandra 1.2

2015-03-03 Thread Kai Wang
This is a tough one. One thing I can think of is to use Spark/Spark SQL to run ad-hoc queries on the C* cluster. You can post on the "Spark Cassandra Connector" user group. On Tue, Mar 3, 2015 at 10:18 AM, Yulian Oifa wrote: > Hello > Initially problem is that customer wants to have an option for ANY qu

Input/Output Error

2015-03-03 Thread 曹志富
Sometimes my C* 2.1.3 cluster hits this error during compaction or streaming. Could this be caused by a disk or filesystem problem? Thanks all. -- Ranger Tsao

Re: sstables remain after compaction

2015-03-03 Thread Robert Coli
On Tue, Mar 3, 2015 at 2:34 PM, Tyler Hobbs wrote: > I'm not aware of any good reason to put 1.1.0 in the middle there. I > would go straight from 1.0.12 to the latest 1.1.x. > +1 =Rob

Re: Documentation of batch statements

2015-03-03 Thread Peter Lin
I agree with Jonathan Haddad. In a traditional ACID transaction following the classic definition, isolation is necessary. Having said that, there are different levels of isolation. http://en.wikipedia.org/wiki/Isolation_%28database_systems%29#Isolation_levels Saying the distinction is pedantic is wr

Re: Documentation of batch statements

2015-03-03 Thread Jonathan Haddad
This is often a confusing topic because someone came up with the term ACID, which lists isolation as well as atomicity, and as a result most people assume they are independent. This is incorrect. For something to be atomic, it actually requires isolation. "An operation is atomic if no intermedia

Re: sstables remain after compaction

2015-03-03 Thread Tyler Hobbs
On Tue, Mar 3, 2015 at 3:44 AM, Jason Wee wrote: > we are in the midst of upgrading... 1.0.8 -> 1.0.12 then to 1.1.0.. then > to the latest of 1.1.. then to 1.2 I'm not aware of any good reason to put 1.1.0 in the middle there. I would go straight from 1.0.12 to the latest 1.1.x. -- Tyler H

Re: Documentation of batch statements

2015-03-03 Thread Tyler Hobbs
On Tue, Mar 3, 2015 at 2:39 PM, Jonathan Haddad wrote: > Actually, that's not true either. It's technically possible for a batch > to be partially applied in the current implementation, even with logged > batches. "atomic" is used incorrectly here, imo, since more than 2 states > can be visible

Re: Issue restarting cassandra with a cluster running Cassandra 1.2.x and Cassandra 2.0.x

2015-03-03 Thread Tobias Hauth
I would recommend against 2.0.12 as long as nodetool cleanup is broken and wait for 2.0.13. On Tue, Mar 3, 2015 at 11:43 AM, Nate McCall wrote: > Did you run 'upgrade sstables'? See these two sections in 2.0's NEWS.txt: > https://github.com/apache/cassandra/blob/cassandra-2.0/NEWS.txt#L132-L141

Re: Documentation of batch statements

2015-03-03 Thread Jonathan Haddad
Actually, that's not true either. It's technically possible for a batch to be partially applied in the current implementation, even with logged batches. "atomic" is used incorrectly here, imo, since more than 2 states can be visible, unapplied & applied. On Tue, Mar 3, 2015 at 9:26 AM Michael Dy

Re: Issue restarting cassandra with a cluster running Cassandra 1.2.x and Cassandra 2.0.x

2015-03-03 Thread Nate McCall
Did you run 'upgrade sstables'? See these two sections in 2.0's NEWS.txt: https://github.com/apache/cassandra/blob/cassandra-2.0/NEWS.txt#L132-L141 https://github.com/apache/cassandra/blob/cassandra-2.0/NEWS.txt#L195-L198 It's a good idea to move up to 2.0.12 while you're at it. There have been a nu

Re: Turning on internal security with no downtime

2015-03-03 Thread Sam Tunnicliffe
If you're able to configure your clients so that they don't send requests to one node in the cluster, you can enable PasswordAuthenticator & CassandraAuthorizer on that node only and use cqlsh to set up all your users & permissions. The rest of the cluster will continue to serve client requests as norm
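For reference, once PasswordAuthenticator is active on that node, the cqlsh setup could look roughly like this (the keyspace and user names are placeholders, not from the original mail):

    -- Illustrative only: connect to the auth-enabled node as the default
    -- superuser (cassandra/cassandra), then create users and grant permissions.
    CREATE USER app_user WITH PASSWORD 'change_me' NOSUPERUSER;
    GRANT SELECT ON KEYSPACE my_keyspace TO app_user;
    GRANT MODIFY ON KEYSPACE my_keyspace TO app_user;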

Issue restarting cassandra with a cluster running Cassandra 1.2.x and Cassandra 2.0.x

2015-03-03 Thread Fabrice Facorat
Hi, we have a 52-node cluster running Apache Cassandra 1.2.13. As we are planning to migrate to Cassandra 2.0.10, we decided to do some tests, and we noticed that once a node in the cluster has been upgraded to Cassandra 2.0.x, restarting a Cassandra 1.2.x node will fail. The tests were done

Re: Reboot: Read After Write Inconsistent Even On A One Node Cluster

2015-03-03 Thread Dan Kinder
Per Aleksey Yeschenko's comment on that ticket, it does seem like a timestamp granularity issue, but it should work properly if it is within the same session. gocql by default uses 2 connections and 128 streams per connection. If you set it to 1 connection with 1 stream this problem goes away. I su

Documentation of batch statements

2015-03-03 Thread Michael Dykman
I have a minor complaint about the documentation. On the page for Batch Statements: http://www.datastax.com/documentation/cql/3.0/cql/cql_reference/batch_r.html It states: "In the context of a Cassandra batch operation, atomic means that if any of the batch succeeds, all of it will." While the
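For reference, the kind of logged batch that page describes looks roughly like this (the table names and values below are made up purely for illustration):

    -- Illustrative only: two denormalized writes grouped into one logged batch.
    BEGIN BATCH
        INSERT INTO users (user_id, email) VALUES ('u1', 'u1@example.com');
        INSERT INTO users_by_email (email, user_id) VALUES ('u1@example.com', 'u1');
    APPLY BATCH;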

Re: best practices for time-series data with massive amounts of records

2015-03-03 Thread mck
> Here "partition" is a random digit from 0 to (N*M) > where N=nodes in cluster, and M=arbitrary number. Hopefully it was obvious, but here (unless you've got hot partitions), you don't need N. ~mck

Re: Composite Keys in cassandra 1.2

2015-03-03 Thread Yulian Oifa
Hello. The initial problem is that the customer wants to have an option for ANY query, which does not fit well with NoSQL. However, the size of the data is too big for a relational DB. There are no typical queries on the data; there are 10 fields, and queries can be made based on any combination of them.

Re: best practices for time-series data with massive amounts of records

2015-03-03 Thread Yulian Oifa
Hello. You can use a timeuuid as the row key and create a separate CF to be used for indexing. The indexing CF may either use user_id as the key, or a better approach is to partition the row by timestamp. In case of partitioning you can create a compound key in which you will store the user_id and a timestamp base (for examp
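A rough CQL sketch of the indexing table being described, with hypothetical names (event_index, time_bucket, etc. are not from the original mail):

    -- Hypothetical sketch: the partition key combines user_id with a coarse
    -- time bucket; events cluster by their timeuuid within each partition.
    CREATE TABLE event_index (
        user_id     text,
        time_bucket text,      -- e.g. '2015-03' for a monthly bucket
        event_id    timeuuid,
        PRIMARY KEY ((user_id, time_bucket), event_id)
    ) WITH CLUSTERING ORDER BY (event_id DESC);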

Re: RDD partitions per executor in Cassandra Spark Connector

2015-03-03 Thread Carl Yeksigian
These questions would be better addressed to the Spark Cassandra Connector mailing list, which can be found here: https://github.com/datastax/spark-cassandra-connector/#community Thanks, Carl On Tue, Mar 3, 2015 at 4:42 AM, Pavel Velikhov wrote: > Hi, is there a paper or a document where one ca

Re: best practices for time-series data with massive amounts of records

2015-03-03 Thread mck
Clint,
> CREATE TABLE events (
>   id text,
>   date text, // Could also use year+month here or year+week or something else
>   event_time timestamp,
>   event blob,
>   PRIMARY KEY ((id, date), event_time))
>   WITH CLUSTERING ORDER BY (event_time DESC);
>
> The downside of this approach is that w

Re: best practices for time-series data with massive amounts of records

2015-03-03 Thread Jack Krupansky
I'd recommend using 100K and 10M as rough guidelines for the maximum number of rows and bytes in a single partition. Sure, Cassandra can technically handle a lot more than that, but very large partitions can make your life more difficult. Of course you will have to do a POC to validate the sweet sp

Re: best practices for time-series data with massive amounts of records

2015-03-03 Thread Jens Rantil
Hi, I have not done something similar, however I have some comments: On Mon, Mar 2, 2015 at 8:47 PM, Clint Kelly wrote: > The downside of this approach is that we can no longer do a simple > continuous scan to get all of the events for a given user. > Sure, but would you really do that real ti

Strange Sizes after 2.1.3 upgrade

2015-03-03 Thread Jan Kesten
Hi, I found something strange this morning on our secondary cluster. I recently upgraded to 2.1.3 - hoping for incremental repairs to work - and this morning OpsCenter showed very unequal disk usage. Most irritating is that some nodes show data sizes of > 3TB, but they h

Re: sstables remain after compaction

2015-03-03 Thread Jason Wee
off topic for this discussion, and yes, we are in the midst of upgrading... 1.0.8 -> 1.0.12, then to 1.1.0, then to the latest of 1.1, then to 1.2. Keeping my fingers crossed for a safe upgrade of such a big cluster... we hope that with Cassandra moving some components off heap in 1.1 and 1.2, the clus

Re: RDD partitions per executor in Cassandra Spark Connector

2015-03-03 Thread Pavel Velikhov
Hi, is there a paper or a document where one can read how Spark reads Cassandra data in parallel? And how it writes data back from RDDs? It's a bit hard to have a clear picture in mind. Thank you, Pavel Velikhov > On Mar 3, 2015, at 1:08 AM, Rumph, Frens Jan wrote: > > Hi all, > > I didn't fi

Does DateTieredCompactionStrategy work with a compound clustering key?

2015-03-03 Thread Thomas Borg Salling
Does DateTieredCompactionStrategy in Apache Cassandra 2.1.2 work with a compound clustering key? More specifically, would it work for a table like this, where (timestamp, hash) makes up a compound clustering key:

CREATE TABLE sensordata (
    timeblock int,
    timestamp timestamp,
    hash int,
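For illustration, a table of that shape with DTCS enabled might be declared as below; the value column and the exact primary key are assumptions, not taken from the original mail:

    -- Hypothetical sketch: timeblock as the partition key and
    -- (timestamp, hash) as the compound clustering key, with DTCS enabled.
    CREATE TABLE sensordata (
        timeblock int,
        timestamp timestamp,
        hash      int,
        value     blob,
        PRIMARY KEY (timeblock, timestamp, hash)
    ) WITH compaction = { 'class': 'DateTieredCompactionStrategy' };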