Cassandra - Storm

2015-04-02 Thread Vanessa Gligor
Hi all, Did anybody use Cassandra for the tuple storage in Storm? I have this scenario: I have a spout (getting messages from RabbitMQ) and I want to save all these messages in Cassandra using a bolt. What is the best choice regarding the connection to the DB? I have read about Hector API. I used

Re: Getting NoClassDefFoundError for com/datastax/spark/connector/mapper/ColumnMapper

2015-04-02 Thread Dave Brosius
This is what I meant by 'initial cause': Caused by: java.lang.ClassNotFoundException: com.datastax.spark.connector.mapper.ColumnMapper. So it is in fact a classpath problem. Here is the class in question: https://github.com/datastax/spark-cassandra-connector/blob/master/spark-cassandra-connector

RE: Getting NoClassDefFoundError for com/datastax/spark/connector/mapper/ColumnMapper

2015-04-02 Thread Tiwari, Tarun
Sorry I was unable to reply for a couple of days. I checked the error again and can’t see any other initial cause. Here is the full error I am getting. Exception in thread "main" java.lang.NoClassDefFoundError: com/datastax/spark/connector/mapper/ColumnMapper at ldCassandraTable.main(ld_

does DC_LOCAL require manually truncating system.paxos on failover?

2015-04-02 Thread Sean Bridges
We are using lightweight transactions, two datacenters and DC_LOCAL consistency level. There is a comment in CASSANDRA-5797, "This would require manually truncating system.paxos when failing over." Is that required? I don't see it documented anywhere else. Thanks, Sean https://issues.apache.

Re: COMMERCIAL:Re: Cross-datacenter requests taking a very long time.

2015-04-02 Thread daemeon reiydelle
You might want to see what quorum is configured? I meant to ask that. *...* *“Life should not be a journey to the grave with the intention of arriving safely in a pretty and well preserved body, but rather to skid in broadside in a cloud of smoke, thoroughly used up, totally worn out, an

Re: COMMERCIAL:Re: Cross-datacenter requests taking a very long time.

2015-04-02 Thread Andrew Vant
On Mar 31, 2015, at 4:59 PM, daemeon reiydelle wrote: > What is your replication factor? NetworkTopologyStrategy with replfactor: 2 in each DC. Someone else asked about the endpoint snitch I'm using; it's set to GossipingPropertyFileSnitch. > Any idea how much data has to be processed under t
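
A rough sketch of the quorum arithmetic behind the replication settings mentioned in this message (replication factor 2 in each of two data centers). It assumes the standard rule that a quorum is floor(replicas / 2) + 1; the data center names "dc1" and "dc2" are placeholders, not taken from the thread.

```python
def quorum(replicas: int) -> int:
    """Number of replicas that must respond: floor(replicas / 2) + 1."""
    return replicas // 2 + 1


# Replication factor per data center, as described in the message.
# "dc1" and "dc2" are hypothetical names used only for illustration.
replication = {"dc1": 2, "dc2": 2}

# QUORUM is computed over the replicas in all data centers combined.
total_replicas = sum(replication.values())
quorum_all = quorum(total_replicas)

# LOCAL_QUORUM is computed over the replicas in one data center only.
local_quorum = {dc: quorum(rf) for dc, rf in replication.items()}

print("QUORUM needs", quorum_all, "of", total_replicas, "replicas")
for dc, needed in local_quorum.items():
    print("LOCAL_QUORUM in", dc, "needs", needed, "of", replication[dc])
```

Under these assumptions, QUORUM needs 3 of 4 replicas, so at least one response must come from the remote data center, while LOCAL_QUORUM needs both local replicas. Whether this explains the slow cross-datacenter requests depends on the consistency level actually in use, which the thread asks about but this snippet does not show.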

Re: Frequent timeout issues

2015-04-02 Thread daemeon reiydelle
To the poster, I am sorry to have taken this off topic. Looking forward to your reply regarding your default heap size, frequency of hard garbage collection, etc. In any case I am not convinced that heap size/garbage collection is a root cause of your issue, but it has been so frequently a problem

Re: Frequent timeout issues

2015-04-02 Thread Jonathan Haddad
@Daemeon you may want to read through https://issues.apache.org/jira/browse/CASSANDRA-8150, there are perfectly valid cases for heap > 16gb. On Thu, Apr 2, 2015 at 10:07 AM daemeon reiydelle wrote: > May not be relevant, but what is the "default" heap size you have > deployed. Should be no more

Re: Column value not getting updated

2015-04-02 Thread daemeon reiydelle
Interesting that you are finding excessive drift from public time servers. I only once saw that problem with AWS' time servers. To be conservative I sometimes recommend that clients spool up their own time server, but realize IT will also drift if the public time servers do! Somewhat different if i
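
As background on why clock drift can make a column appear not to update: Cassandra reconciles conflicting writes to the same column by keeping the one with the highest write timestamp. The toy model below is only an illustration of that rule, not Cassandra code; the `Cell` class, the `reconcile` function, and the clock offsets are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    value: str
    timestamp_us: int  # write timestamp in microseconds


def reconcile(current: Cell, incoming: Cell) -> Cell:
    """Keep the cell with the higher write timestamp (last write wins).

    On equal timestamps this sketch keeps the existing cell; real
    tie-breaking is not modeled here.
    """
    return incoming if incoming.timestamp_us > current.timestamp_us else current


def write_timestamp(real_time_s: float, clock_offset_s: float) -> int:
    """Timestamp a writer would assign, given how far its clock is off."""
    return int((real_time_s + clock_offset_s) * 1_000_000)


# Writer A's clock runs 5 seconds fast; writer B's clock is accurate.
first = Cell("old", write_timestamp(real_time_s=100.0, clock_offset_s=5.0))
second = Cell("new", write_timestamp(real_time_s=102.0, clock_offset_s=0.0))

# B writes later in real time, but A's timestamp is larger, so A's value wins.
stored = reconcile(first, second)
print(stored.value)
```

In this sketch the later write is silently discarded because its timestamp is smaller, which is the kind of symptom the thread title describes. Whether drift is the actual cause in the original poster's cluster cannot be determined from this snippet.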

Re: Frequent timeout issues

2015-04-02 Thread daemeon reiydelle
May not be relevant, but what is the "default" heap size you have deployed? It should be no more than 16 GB (and be aware of the impact of GC at that size); I suggest not smaller than 8-12 GB. On Wed, Apr 1, 2015 at 11:28 AM, Anuj Wadehra wrote: > Are you writing multiple cf at same time? > Pl

Re: Cluster status instability

2015-04-02 Thread daemeon reiydelle
Do you happen to be using a tool like Nagios or Ganglia that is able to report utilization (CPU, load, disk I/O, network)? There are plugins for both that will also notify you (depending on whether you enabled the intermediate GC logging) about what is happening. On Thu, Apr 2, 2015 at 8:35 A

Re: Best practice: Multiple clusters vs multiple tables in a single cluster?

2015-04-02 Thread daemeon reiydelle
Jack did a superb job of explaining all of your issues, and his last sentence seems to fit your needs (and my experience) very well. The only other point I would add is to ascertain if the use patterns commend microservices to abstract from data locality, even if the initial deployment is a noop to

Re: Best practice: Multiple clusters vs multiple tables in a single cluster?

2015-04-02 Thread Jack Krupansky
Sounds very appropriate for your situation. Also... you have the option of creating separate data centers, so that one cluster can service multiple work loads, so you get the benefits of both worlds, but that would mean you need separate nodes for the different key spaces for your use case, so it

Re: Best practice: Multiple clusters vs multiple tables in a single cluster?

2015-04-02 Thread Ian Rose
Thanks for the input, folks! As a startup, we don't really have different dev teams / apps - everything is in service of "the product", so given these responses, I think putting both into the same cluster is the best idea. And if we want to split them out in the future we are still small enough t

Re: Cluster status instability

2015-04-02 Thread Jan
Marcin, are all your nodes within the same region? If not in the same region, what is the snitch type that you are using? Jan/ On Thursday, April 2, 2015 3:28 AM, Michal Michalski wrote: Hey Marcin, Are they actually going up and down repeatedly (flapping) or just dow

Re: Best practice: Multiple clusters vs multiple tables in a single cluster?

2015-04-02 Thread Carlos Rolo
Adding a new keyspace should be perfectly fine, unless you have completely distinct workloads for the different keyspaces. Even so, you can balance some stuff at the keyspace/table level. But I would go with a new keyspace, not a new cluster, given the small size you say you have. Regards, Carlos

Re: Best practice: Multiple clusters vs multiple tables in a single cluster?

2015-04-02 Thread Jack Krupansky
There is an old saying in the software industry: The structure of a system follows from the structure of the organization that created it (Conway's Law). Seriously, the main, first question for your end is who owns the applications in terms of executive management, such that if management makes a d

Best practice: Multiple clusters vs multiple tables in a single cluster?

2015-04-02 Thread Ian Rose
Hi all - We currently have a single cassandra cluster that is dedicated to a relatively narrow purpose, with just 2 tables. Soon we will need cassandra for another, unrelated, system, and my debate is whether to just add the new tables to our existing cassandra cluster or whether to spin up an en

Re: Cluster status instability

2015-04-02 Thread Michal Michalski
Hey Marcin, Are they actually going up and down repeatedly (flapping) or just down and they never come back? There might be different reasons for flapping nodes, but to list what I have off the top of my head right now: 1. Network issues. I don't think it's your case, but you can read about the is

Cluster status instability

2015-04-02 Thread Marcin Pietraszek
Hi! We have a 56-node cluster with C* 2.0.13 + CASSANDRA-9036 patch installed. Assume we have nodes A, B, C, D, E. On some irregular basis one of those nodes starts to report that a subset of the other nodes is in DN state, although the C* daemon on all nodes is running: A$ nodetool status UN B DN C DN D UN E

Re: Exception while running cassandra stress client

2015-04-02 Thread Abhinav Ranjan
Hi, We too got the same error. Use cassandra-stress shipped with cassandra 2.1.x to run the test like that. Regards Abhinav On 02-Apr-2015 11:44 am, "ankit tyagi" wrote: > Hi All, > > while running cassandra stress tool shipped with cassandra 2.0.4 version, > i am getting following error > > *.

Re: Multinode Cassandra and sstableloader

2015-04-02 Thread Serega Sheypak
So, sstableloader streams a portion of data stored in /var/lib/cassandra/data/keyspace/table catalog. If we have 3 nodes and RF=3, then only 1/3 of data would be streamed to other cluster. Problem is solved. 2015-04-01 12:05 GMT+02:00 Alain RODRIGUEZ : > From Michael Laing - posted on the wrong t

Re: SSTable structure

2015-04-02 Thread Serega Sheypak
Thank you, great to know that. 2015-04-01 23:14 GMT+02:00 Bharatendra Boddu : > Hi Serega, > > Most of the content in the blog article is still relevant. After 1.2.5 > (ic), there are only three new versions (ja, jb, ka) for SSTable format. > Following are the changes in these versions. > >