Which source replica does rebuild stream from?

2021-11-24 Thread Sam Kramer
Hi,

We're looking into adding a second datacenter to our cluster via a
rebuild, and we're curious about how Cassandra determines which
replica in the source datacenter to stream from. For a bit more
context, we're using the Ec2Snitch with dynamic snitch enabled, and
are using NetworkTopologyStrategy for all of our keyspaces with RF =
3.

Looking at the source code, it appears that the source is the closest
replica in the source datacenter, as ranked by the snitch
(https://github.com/apache/cassandra/blob/cassandra-3.11.11/src/java/org/apache/cassandra/dht/RangeStreamer.java#L226),
which I think is generally fine. Is this correct, or am I misreading
the code?
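
To make the question concrete, here is a minimal sketch of the
selection as I understand it. It is not Cassandra's code: pick_source,
the score values, and the node names are all hypothetical, and it
assumes that a lower score means a closer replica.

```python
# Hypothetical model of "stream from the closest live replica".
# This is NOT Cassandra's RangeStreamer; names and scores are made up.

def pick_source(replicas, scores, alive):
    """Return the live replica with the lowest proximity score.

    Assumption: lower score = closer. Replicas with no score sort last.
    """
    candidates = [r for r in replicas if r in alive]
    if not candidates:
        raise RuntimeError("no live source replica for this range")
    # min() returns the first minimal element, so ties keep list order.
    return min(candidates, key=lambda r: scores.get(r, float("inf")))


replicas = ["dc1-a", "dc1-b", "dc1-c"]
scores = {"dc1-a": 0.8, "dc1-b": 0.2, "dc1-c": 0.5}

all_up = pick_source(replicas, scores, alive={"dc1-a", "dc1-b", "dc1-c"})
b_down = pick_source(replicas, scores, alive={"dc1-a", "dc1-c"})
```

With these made-up scores, all_up is dc1-b, and b_down falls back to
dc1-c once dc1-b is treated as down.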

If so, there appears to be an edge case surrounding consistency which
I would like to clarify:

Assuming identical topologies, there is no strict guarantee that each
source replica is streamed over to the destination datacenter. This is
because proximity is determined by the snitch, which could have
dropped a node from its list for being down, and the dynamic snitch
could have given a node a worse (higher) score.

As a result, when rebuilding the nodes in each rack, it is entirely
possible for all racks to receive their data from the same source
replica, and that replica may not be fully consistent. Is that right?
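
The edge case can be shown with a toy model. Everything here is an
assumption made for illustration (the node names, the write ids w1 and
w2, and the rebuild helper); it only models one token range and does
not reflect how Cassandra actually streams data.

```python
# Toy model of one token range. Each DC1 replica holds a set of write ids.
# dc1-c missed w2, for example because of a dropped mutation.
dc1_data = {
    "dc1-a": {"w1", "w2"},
    "dc1-b": {"w1", "w2"},
    "dc1-c": {"w1"},
}


def rebuild(dest_nodes, choose_source):
    """Copy the chosen source replica's data to each destination node."""
    return {dest: set(dc1_data[choose_source(dest)]) for dest in dest_nodes}


def quorum_read_sees(data, write_id, quorum=2):
    """True if every possible quorum of replicas includes write_id.

    That holds when fewer than `quorum` replicas are missing the write.
    """
    missing = sum(1 for held in data.values() if write_id not in held)
    return missing < quorum


dc2_nodes = ["dc2-a", "dc2-b", "dc2-c"]

# Case 1: every DC2 node happens to pick the same (stale) source.
same_source = rebuild(dc2_nodes, lambda dest: "dc1-c")

# Case 2: each DC2 node streams from the DC1 node in the matching rack.
by_rack = rebuild(dc2_nodes, lambda dest: dest.replace("dc2", "dc1"))

stale_read_possible = not quorum_read_sees(same_source, "w2")
rack_matched_ok = quorum_read_sees(by_rack, "w2")
```

In case 1 no DC2 replica holds w2, so a quorum read in DC2 cannot
return it even though a quorum read in DC1 can. In case 2 only dc2-c
is missing w2, so any two DC2 replicas still include it.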

Cheers,
Sam


Re: Which source replica does rebuild stream from?

2021-11-25 Thread Sam Kramer
Hi both, thank you for your responses!

Yes Jeff, we expect strictly correct responses. Our starting / ending
topologies are near-identical (DC1: A/B/C, DC2: A/B/C), and reads are
performed at LOCAL_QUORUM, while writes are done at EACH_QUORUM or ALL.
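
For reference, a small sketch of the quorum arithmetic behind that
expectation, assuming RF = 3 per datacenter. The helper names are
hypothetical; the only rule used is that a read and a write are
guaranteed to share a replica when R + W > RF within a datacenter.

```python
def quorum(rf):
    """Quorum size for a given replication factor: floor(rf / 2) + 1."""
    return rf // 2 + 1


def overlaps(read_replicas, write_replicas, rf):
    """True if any read set must share a replica with any write set."""
    return read_replicas + write_replicas > rf


rf = 3
local_quorum = quorum(rf)                # replicas per DC for a quorum
each_quorum_ok = overlaps(local_quorum, local_quorum, rf)
all_ok = overlaps(local_quorum, rf, rf)  # writes at ALL touch every replica
```

With RF = 3 the quorum is 2, so both write levels overlap with
LOCAL_QUORUM reads as long as every replica in the datacenter actually
received its share of past writes.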

Thanks,
Sam

On Thu, Nov 25, 2021 at 9:38 AM Jeff Jirsa wrote:

> The risk is not negligible if you expect strictly correct responses.
>
> The only way to do this correctly is very, very labor intensive at the
> moment, and it requires repair between rebuilds and incrementally adding
> replicas such that you don’t violate consistency.
>
> If you give me the starting topology, ending topology, and what
> consistency level you use for reads and writes, I’ll describe the changes
> you have to make to do this safely.
>
> On Nov 25, 2021, at 8:49 AM, Erick Ramirez wrote:
>
> Yes, you are correct that the source may not necessarily be fully
> consistent. But this risk is negligible if your cluster is sized correctly
> and nodes are not dropping mutations.
>
> If your nodes are dropping mutations because they're overloaded and cannot
> keep up with writes, rebuild is probably the least of your problems. Cheers!
>