Thanks, Josh, for your response. I agree that enabling that feature may make the JVM unstable. (We may still want to try it out internally and run some perf/stress tests; I'll share more information once we have results. If you have any suggestions on testing, please let me know.)

Another question. In your design doc: https://docs.google.com/document/d/1ZxCWYkeZTquxsvf5hdPc0fiUnUHna8POvgt6TIzML4Y/edit you mentioned: "We will be providing a cdc compactor using the CommitLogReader interface shortly w/a config file specifying CF’s to preserve that will take cdc_raw data and compact it to cdc_processed". Is this feature implemented? It feels like it does similar work to the "daemon". If a user wants to implement such a "daemon" inside Cassandra, they could just extend that class, or use a callback interface in cdc_compactor, right? Would you please share more thoughts on that?

If you have any CDC improvements (or bugs) in mind, would you please create JIRA tickets for them? Maybe we could contribute. Also, we're backporting the CDC feature to 3.0 internally; any suggestions would be appreciated.

Thanks,
Jay

On 2/10/17 5:39 AM, Josh McKenzie wrote:
The primary reason I avoided integrating a daemon into the Cassandra
process was the increase in heap pressure and further muddying of the
profile of heap usage. We've already seen that mixing read/write,
compaction, streaming, and repair in the same JVM causes a nasty mix of
allocation patterns that are pretty much impossible to optimize for, so
furthering that problem wasn't on my ToDo list.

Having a tool in-tree? Sure. But I'd strongly recommend against having it
be in-process.

On Thu, Feb 9, 2017 at 7:19 PM, Jay Zhuang <jay.zhu...@yahoo.com.invalid>
wrote:

No. The idea is to have Cassandra manage the CDC logs, instead of having
another daemon process handle that.

Here is the CDC design JIRA: CASSANDRA-8844. The pain point is developing and
managing the daemon. If the two are integrated, it will be easier to manage
and monitor.

Thanks,
Jay


On 2/9/17 3:57 PM, Dikang Gu wrote:

Is it for testing purposes?

On Thu, Feb 9, 2017 at 3:54 PM, Jay Zhuang <jay.zhu...@yahoo.com.invalid>
wrote:

Hi,

Processing the CDC commit logs requires a separate daemon process. Carl
has a daemon example here: CASSANDRA-11575.

Does it make sense to integrate it into Cassandra, so the user doesn't
have to manage another JVM on the same box? We could then provide an ITrigger-like
interface (https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/triggers/ITrigger.java#L49) to process the data.
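
For illustration only, here is a minimal sketch of the shape such an ITrigger-like callback might take. The names CdcSegmentHandler, CdcDispatcher, onSegment, and segmentReady are hypothetical and do not exist in Cassandra; the sketch assumes the segment manager would hand each finished CDC segment file to registered handlers, and it is not a proposed patch.

```java
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical callback, loosely analogous in spirit to ITrigger; not a Cassandra API.
interface CdcSegmentHandler {
    // Called once for each CDC commit log segment that is ready to be consumed.
    void onSegment(Path segment) throws Exception;
}

// Hypothetical in-process dispatcher that a CDC-aware segment manager could call.
class CdcDispatcher {
    private final List<CdcSegmentHandler> handlers = new CopyOnWriteArrayList<>();

    void register(CdcSegmentHandler handler) {
        handlers.add(handler);
    }

    // Invokes every registered handler and returns how many completed without throwing.
    int segmentReady(Path segment) {
        int succeeded = 0;
        for (CdcSegmentHandler handler : handlers) {
            try {
                handler.onSegment(segment);
                succeeded++;
            } catch (Exception e) {
                // A failing handler is reported and skipped so the others still run.
                System.err.println("CDC handler failed for " + segment + ": " + e);
            }
        }
        return succeeded;
    }
}
```

In this shape the handlers run inside the Cassandra JVM, so whatever they allocate lands on the node's own heap.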

Or maybe provide an interface option to handle the CDC commit log in
SegmentManager (https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogSegmentManagerCDC.java#L68).

Any comments? If it makes sense, I could create a JIRA for that.

Thanks,
Jay





