Could be done with CDC
Could be done with triggers
(Could be done with vtables - double writes or double reads - if they were extended to be user facing)

Would be very hard to generalize properly, especially handling failure cases (write succeeds in one cluster/table but not the other), which are often app specific

-- 
Jeff Jirsa

> On Oct 18, 2018, at 6:47 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
>
> Isn't this what CDC was designed for?
>
> https://issues.apache.org/jira/browse/CASSANDRA-8844
>
> On Thu, Oct 18, 2018 at 10:54 AM Carl Mueller
> <carl.muel...@smartthings.com.invalid> wrote:
>
>> tl;dr: a generic trigger on TABLES that will mirror all writes to
>> facilitate data migrations between clusters or systems. What is necessary
>> to ensure full write mirroring/coherency?
>>
>> When Cassandra clusters have several "apps" aka keyspaces serving
>> applications colocated on them, but the app/keyspace bandwidth and size
>> demands begin impacting other keyspaces/apps, then one strategy is to
>> migrate the keyspace to its own dedicated cluster.
>>
>> With backups/sstableloading, this will entail a delay and therefore a
>> "coherency" shortfall between the clusters. So typically one would employ a
>> "double write, read once":
>>
>> - all updates are mirrored to both clusters
>> - reads come from the current most coherent cluster.
>>
>> Often two sstable loads are done:
>>
>> 1) first load
>> 2) turn on double writes/write mirroring
>> 3) a second load is done to finalize coherency
>> 4) switch the app to point to the new cluster now that it is coherent
>>
>> The double writes and single read are the sticking point. We could do it at
>> the app layer, but if the app wasn't written with that in mind, it is a lot
>> of testing and customization specific to the framework.
>>
>> We could theoretically do some sort of proxying of the java-driver somehow,
>> but all the async structures and complex interfaces/apis would be difficult
>> to proxy. Maybe there is a lower level in the java-driver where this is
>> possible. This also would only apply to the java-driver, and not
>> python/go/javascript/other drivers.
>>
>> Finally, I suppose we could do a trigger on the tables. It would be really
>> nice if we could add to the Cassandra toolbox the basics of a write
>> mirroring trigger that could be activated "fairly easily"... now I know
>> there are the complexities of inter-cluster access, and of whether we are
>> even using Cassandra as the target mirror system (for example there is an
>> article on triggers write-mirroring to Kafka:
>> https://dzone.com/articles/cassandra-to-kafka-data-pipeline-part-1).
>>
>> And this starts to get into the complexities of hinted handoff as well. But
>> fundamentally this seems something that would be a very nice feature
>> (especially when you NEED it) to have in the core of Cassandra.
>>
>> Finally, is the mutation hook in triggers sufficient to track all incoming
>> mutations (outside of "shudder" other triggers generating data)?
>>
>
>
> --
> Jonathan Ellis
> co-founder, http://www.datastax.com
> @spyced
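
For illustration only, here is a minimal sketch of the app-layer "double write, read once" approach and the partial-failure case raised above (write succeeds in one cluster but not the other). Everything in it is hypothetical: InMemoryStore stands in for a cluster/table and is not a Cassandra or driver API, and the choice to queue a failed mirror write for replay rather than fail the request is just one possible policy, which is the part that tends to be app specific.

```python
from dataclasses import dataclass, field


class WriteFailed(Exception):
    """Raised by a store when a write does not succeed."""


@dataclass
class InMemoryStore:
    """Hypothetical stand-in for a cluster/table; not a Cassandra API."""
    name: str
    fail_writes: bool = False
    rows: dict = field(default_factory=dict)

    def write(self, key, value):
        if self.fail_writes:
            raise WriteFailed(f"{self.name} rejected write for {key!r}")
        self.rows[key] = value

    def read(self, key):
        return self.rows.get(key)


@dataclass
class MirroringWriter:
    """Double write, read once: write to both stores, read from the primary."""
    primary: InMemoryStore
    mirror: InMemoryStore
    # Writes that reached the primary but not the mirror, kept for later replay.
    pending_replay: list = field(default_factory=list)

    def write(self, key, value):
        # A failure on the primary propagates to the caller unchanged.
        self.primary.write(key, value)
        try:
            self.mirror.write(key, value)
        except WriteFailed:
            # Policy choice (assumption): record the miss instead of failing
            # the request; another app may prefer to surface the error.
            self.pending_replay.append((key, value))

    def read(self, key):
        return self.primary.read(key)

    def replay(self):
        """Retry mirror writes that failed earlier; keep those that fail again."""
        still_pending = []
        for key, value in self.pending_replay:
            try:
                self.mirror.write(key, value)
            except WriteFailed:
                still_pending.append((key, value))
        self.pending_replay = still_pending


old = InMemoryStore("old-cluster")
new = InMemoryStore("new-cluster", fail_writes=True)
writer = MirroringWriter(primary=old, mirror=new)
writer.write("user:1", "alice")   # lands in old-cluster only
new.fail_writes = False
writer.replay()                   # mirror catches up
```

Note that the replay here ignores ordering, so a stale queued value could overwrite a newer one on the mirror; a real version would need to carry the original write timestamp along with each mutation.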