Which source replica does rebuild stream from?
Hi,

We're looking into adding a second datacenter to our cluster via a rebuild, and we're curious about how Cassandra determines which replica in the source datacenter to rebuild from. For a bit more context, we're using the Ec2Snitch with the dynamic snitch enabled, and are using NetworkTopologyStrategy for all of our keyspaces with RF = 3.

Looking at the source code, it appears that the source is the closest replica in the source datacenter, as ranked by the snitch (https://github.com/apache/cassandra/blob/cassandra-3.11.11/src/java/org/apache/cassandra/dht/RangeStreamer.java#L226), which I think is generally fine. Is this correct, or am I misreading the code?

If so, there appears to be an edge case around consistency which I would like to clarify. Assuming identical topologies, there is no strict guarantee that every source replica is streamed over to the destination datacenter. This is because we're using the snitch to determine proximity, and the snitch could have removed a node from its list for being down, or the dynamic snitch itself could have given it a worse score. As a result, when rebuilding each node in its respective rack, it is entirely possible for all racks to receive the same data from the same source replica, which, of course, may not be fully consistent?

Cheers,
Sam
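The edge case described above can be sketched with a toy model. This is not Cassandra's code: `pick_source`, the node names, and the score values are all hypothetical, and the only assumption carried over from the question is that each rebuilding node independently picks the live source replica its snitch ranks as closest (lower score meaning closer here).

```python
from collections import Counter

def pick_source(replicas, scores, down=frozenset()):
    """Return the live replica with the lowest (closest) score.

    Hypothetical stand-in for 'closest replica according to the snitch'.
    """
    live = [r for r in replicas if r not in down]
    if not live:
        raise RuntimeError("no live source replica for this range")
    return min(live, key=lambda r: scores[r])

# The three replicas of one token range in the source DC (RF = 3).
source_replicas = ["dc1-a", "dc1-b", "dc1-c"]

# Each rebuilding node in the new DC has its own view of the scores.
# The values are made up; here every view happens to favour dc1-a.
views = {
    "dc2-a": {"dc1-a": 0.1, "dc1-b": 0.4, "dc1-c": 0.5},
    "dc2-b": {"dc1-a": 0.2, "dc1-b": 0.3, "dc1-c": 0.6},
    "dc2-c": {"dc1-a": 0.1, "dc1-b": 0.9, "dc1-c": 0.2},
}

chosen = {node: pick_source(source_replicas, scores)
          for node, scores in views.items()}
usage = Counter(chosen.values())

print(chosen)
print(usage)
```

In this made-up scenario all three new nodes choose `dc1-a`, so nothing is ever streamed from `dc1-b` or `dc1-c` for that range. Passing `down={"dc1-a"}` to `pick_source` shows the other half of the concern: a replica that is considered down is simply skipped.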
Re: Which source replica does rebuild stream from?
Hi both, thank you for your responses!

Yes Jeff, we expect strictly correct responses. Our starting / ending topologies are near-identical (DC1: A/B/C, DC2: A/B/C), and reads are performed at LOCAL_QUORUM, while writes are done at EACH_QUORUM or ALL.

Thanks,
Sam

On Thu, Nov 25, 2021 at 9:38 AM Jeff Jirsa wrote:
> The risk is not negligible if you expect strictly correct responses
>
> The only way to do this correctly is very, very labor intensive at the
> moment, and it requires repair between rebuilds and incrementally adding
> replicas such that you don’t violate consistency
>
> If you give me the starting topology, ending topology, and what
> consistency level you use for reads and writes I’ll describe the changes
> you have to do to do this safely
>
> On Nov 25, 2021, at 8:49 AM, Erick Ramirez wrote:
>> Yes, you are correct that the source may not necessarily be fully
>> consistent. But this risk is negligible if your cluster is sized-correctly
>> and nodes are not dropping mutations.
>>
>> If your nodes are dropping mutations because they're overloaded and cannot
>> keep up with writes, rebuild is probably the least of your problems. Cheers!
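To make the concern concrete for the topology given above (RF = 3 per DC, LOCAL_QUORUM reads), here is a small worked sketch. It is a deliberately simplified model, not a description of Cassandra internals: the node names and the `quorum_read_can_miss` helper are hypothetical, and it ignores read repair, hints, and writes that arrive while the rebuild is running.

```python
def quorum(rf):
    """Quorum size for a replication factor: floor(rf / 2) + 1."""
    return rf // 2 + 1

def quorum_read_can_miss(has_write, rf):
    """True if some quorum of replicas could all lack the write.

    A quorum read contacts quorum(rf) replicas, so it can miss the write
    only when at least that many replicas do not have it.
    """
    stale = sum(1 for present in has_write.values() if not present)
    return stale >= quorum(rf)

RF = 3

# Before the rebuild: a write was acknowledged by a quorum in DC1,
# but the (hypothetical) replica dc1-a missed it.
dc1 = {"dc1-a": False, "dc1-b": True, "dc1-c": True}
assert sum(dc1.values()) >= quorum(RF)

# Case 1: every new node streamed this range from the same replica, dc1-a.
same_source = {"dc2-a": "dc1-a", "dc2-b": "dc1-a", "dc2-c": "dc1-a"}
dc2_same = {node: dc1[src] for node, src in same_source.items()}

# Case 2: each new node streamed from a different source replica.
distinct_sources = {"dc2-a": "dc1-a", "dc2-b": "dc1-b", "dc2-c": "dc1-c"}
dc2_distinct = {node: dc1[src] for node, src in distinct_sources.items()}

print("DC1 read can miss the write:", quorum_read_can_miss(dc1, RF))
print("DC2 (same source) can miss:", quorum_read_can_miss(dc2_same, RF))
print("DC2 (distinct sources) can miss:", quorum_read_can_miss(dc2_distinct, RF))
```

Under these assumptions a LOCAL_QUORUM read in DC1 always overlaps the write, and so does one in DC2 when each new node copied from a different source replica. When all three new nodes copied from the one replica that missed the write, no replica in DC2 has it, so a LOCAL_QUORUM read there can return stale data.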