Cassandra CDC updating

2019-01-10 Thread Hao Zhang
Hi All, I enabled CDC through yaml. When I insert 100k small rows, I don't see a CDC file being created or updated unless I restart the cassandra service after each update. However, when I insert rows with columns of 1MB, I start to see CDC files added. I looked at the commit log, they are updated in a tim
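
A back-of-the-envelope sketch of why row size might matter here. It assumes, without confirmation from the thread, that a CDC file only becomes visible once a whole commit log segment has been filled and released, and that the segment size is 32 MB (the default value of `commitlog_segment_size_in_mb`). The 100-byte figure for a "small row" and the function name `segments_filled` are illustrative assumptions only.

```python
# Assumption: 32 MB commit log segments (commitlog_segment_size_in_mb = 32).
SEGMENT_BYTES = 32 * 1024 * 1024


def segments_filled(row_count: int, bytes_per_row: int,
                    segment_bytes: int = SEGMENT_BYTES) -> int:
    """Return how many whole segments the given write volume would fill."""
    return (row_count * bytes_per_row) // segment_bytes


# Assumption: a "small row" is roughly 100 bytes in the commit log.
small_rows = segments_filled(100_000, 100)

# 100 rows, each carrying a 1 MB column.
large_rows = segments_filled(100, 1024 * 1024)

print("segments filled by 100k small rows:", small_rows)
print("segments filled by 100 rows of 1 MB:", large_rows)
```

Under these assumptions the small-row workload never completes a segment, while the 1 MB rows complete several, which would match the behaviour described in the message.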

RE: Cassandra CDC

2018-02-06 Thread Rahul Singh
> Many Thanks > Nigel > > From: Nigel LEACH > Sent: 06 February 2018 14:25 > To: user@cassandra.apache.org > Subject: RE: Cassandra CDC > > Thanks Rahul, I looked at the smartcat implementation, and am doing something > very similar. Unfortunately, I’m using a mixe

RE: Cassandra CDC

2018-02-06 Thread Nigel LEACH
Not too much delving needed, I upgraded jamm to v0.3.2. I’m not entirely sure why this was required, it seems a little obscure, but I’m back on track. Many Thanks Nigel From: Nigel LEACH Sent: 06 February 2018 14:25 To: user@cassandra.apache.org Subject: RE: Cassandra CDC Thanks Rahul, I

RE: Cassandra CDC

2018-02-06 Thread Nigel LEACH
just have to delve a bit more deeply. Regards Nigel From: Rahul Singh [mailto:rahul.xavier.si...@gmail.com] Sent: 06 February 2018 14:14 To: user@cassandra.apache.org Subject: Re: Cassandra CDC Nigel, Are you using something like this or rolled your own? https://github.com/smartcat-labs

Re: Cassandra CDC

2018-02-06 Thread Rahul Singh
Nigel, Are you using something like this or rolled your own? https://github.com/smartcat-labs/cassandra-kafka-connector/tree/master/cassandra-cdc I've used it in a docker composition and it seemed to work fine for me. https://github.com/smartcat-labs/cassandra-kafka-connector/blob/master

Cassandra CDC

2018-02-06 Thread Nigel LEACH
Hello, I'm loading Cassandra (v3.10.0.1652) data into a Kafka (v1.0.0) topic via CDC and the org.apache.cassandra.db.commitlog.CommitLogReader. All seems to fit together, but I am seeing an "Invalid partitioner RandomPartitioner" error thrown. Is CDC compatible with the RandomPartitioner? There

Ask for suggestions to de-duplicate data for Cassandra CDC

2017-06-20 Thread Jay Zhuang
Hi, For Cassandra CDC feature: http://cassandra.apache.org/doc/latest/operating/cdc.html The CDC data is duplicated RF number of times. Let's say replication factor is 3 in one DC, the same data will be sent out 3 times. One solution is adding another DC with RF=1, which will be only use
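
To make the duplication problem concrete, here is a minimal consumer-side sketch, not the approach proposed in the thread. It assumes each CDC event can be reduced to a hashable identifier such as (keyspace, table, partition key, write timestamp); the `Deduplicator` class, the `mutation_id` helper, and the field names are hypothetical and are not part of any Cassandra API.

```python
from collections import OrderedDict
from typing import Hashable, Tuple


class Deduplicator:
    """Remember recently seen mutation ids in a bounded, LRU-style window."""

    def __init__(self, capacity: int = 100_000) -> None:
        self.capacity = capacity
        self._seen: "OrderedDict[Hashable, None]" = OrderedDict()

    def is_new(self, mutation_id: Hashable) -> bool:
        """Return True the first time an id is seen, False for repeats."""
        if mutation_id in self._seen:
            # Refresh the entry so it is evicted last.
            self._seen.move_to_end(mutation_id)
            return False
        self._seen[mutation_id] = None
        if len(self._seen) > self.capacity:
            # Drop the oldest entry to keep memory bounded.
            self._seen.popitem(last=False)
        return True


def mutation_id(keyspace: str, table: str, partition_key: str,
                write_timestamp_micros: int) -> Tuple[str, str, str, int]:
    """Hypothetical identifier; assumes these fields identify a mutation."""
    return (keyspace, table, partition_key, write_timestamp_micros)


# Simulate RF=3: the same mutation arrives from three replicas.
events = [mutation_id("ks", "users", "alice", 1_000)] * 3
events.append(mutation_id("ks", "users", "bob", 1_001))

dedup = Deduplicator(capacity=10)
forwarded = [e for e in events if dedup.is_new(e)]
print(forwarded)
```

The bounded window trades completeness for memory: a duplicate that arrives after its id has been evicted would be forwarded again, so the capacity has to cover the expected lag between replicas.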